Description

Technical Field 

[0001] The present invention relates to new isolates of the viral class Hepatitis C, polypeptides, polynucleotides and
antibodies derived therefrom, as well as the use of such polypeptides, polynucleotides and antibodies in assays (e.g.,
immunoassays, nucleic acid hybridization assays, etc.) and in the production of viral polypeptides.

Background 


[0002] Non-A, Non-B hepatitis (NANBH) is a transmissible disease or family of diseases that are believed to be
viral-induced, and that are distinguishable from other forms of viral-associated liver diseases, including that caused by the
known hepatitis viruses, i.e., hepatitis A virus (HAV), hepatitis B virus (HBV), and delta hepatitis virus (HDV), as well as
the hepatitis induced by cytomegalovirus (CMV) or Epstein-Barr virus (EBV). NANBH was first identified in transfused
individuals. Transmission from man to chimpanzee and serial passage in chimpanzees provided evidence that NANBH
is due to a transmissible infectious agent or agents. Epidemiologic evidence suggests that there may be three types
of NANBH: the water-borne epidemic type; the blood or needle associated type; and the sporadically occurring
(community acquired) type. However, until recently, no transmissible agent responsible for NANBH had been identified.
[0003] Clinical diagnosis and identification of NANBH has been accomplished primarily by exclusion of other viral
markers. Among the methods used to detect putative NANBH antigens and antibodies are agar-gel diffusion,
counter-immunoelectrophoresis, immunofluorescence microscopy, immune electron microscopy, radioimmunoassay, and
enzyme-linked immunosorbent assay. However, none of these assays has proved to be sufficiently sensitive, specific,
and reproducible to be used as a diagnostic test for NANBH.

[0004] Until recently there has been neither clarity nor agreement as to the identity or specificity of the
antigen-antibody systems associated with agents of NANBH. It is possible that NANBH is caused by more than one infectious agent,
and it is unclear what the serological assays detect in the serum of patients with NANBH.

[0005] In the past a number of candidate NANBH agents were postulated. See, e.g., Prince (1983) Ann. Rev.
Microbiol. 37:217; Feinstone & Hoofnagle (1984) New Eng. J. Med. 311:185; Overby (1985) Curr. Hepatol. 5:49; Overby
(1986) Curr. Hepatol. 6:65; Overby (1987) Curr. Hepatol. 7:35; and Iwarson (1987) British Med. J. 295:946. However, there
is no proof that any of these candidates represent the etiological agent of NANBH.

[0006] In 1987, Houghton et al. cloned the first virus definitively linked to NANBH. See, e.g., EPO Pub. No. 318,216;
Houghton et al., Science 244:359 (1989). Houghton et al. described therein the cloning of an isolate from a new viral
class, hepatitis C virus (HCV), the prototype isolate described therein being named "HCV1". HCV is a Flavi-like virus,
with an RNA genome. Houghton et al. described the production of recombinant proteins from HCV sequences that are
useful as diagnostic reagents, as well as polynucleotides useful in diagnostic hybridization assays and in the cloning of
additional HCV isolates.

[0007] The demand for sensitive, specific methods for screening and identifying carriers of NANBH and
NANBH-contaminated blood or blood products is significant. Post-transfusion hepatitis (PTH) occurs in approximately 10% of
transfused patients, and NANBH accounts for up to 90% of these cases. There is a frequent progression to chronic liver
damage (25-55%).

[0008] Patient care as well as the prevention of transmission of NANBH by blood and blood products or by close
personal contact require reliable diagnostic and prognostic tools to detect nucleic acids, antigens and antibodies related to
NANBH. In addition, there is also a need for effective vaccines and immunotherapeutic agents for the
prevention and/or treatment of the disease.
[0009] While at least one HCV isolate has been identified which is useful in meeting the above needs, additional
isolates, particularly those with a divergent genome, may prove to have unique applications.

Summary of the Invention 

[0010] New isolates of HCV have been characterized from Japanese blood donors who have been implicated as
NANBH carriers. These isolates exhibit nucleotide and amino acid sequence heterogeneity with respect to the
prototype isolate, HCV1, in several viral domains. It is believed that these distinct sequences are of importance,
particularly in diagnostic assays and in vaccine development.

[0011] In one embodiment, the present invention provides a DNA molecule comprising a nucleotide sequence of at
least 15 bp from an HCV isolate substantially homologous to an isolate selected from the group J1 or J7, wherein said
nucleotide sequence is distinct from the nucleotide sequence of HCV isolate HCV1.

[0012] In another embodiment, the present invention provides a DNA molecule comprising a nucleotide sequence of
at least 15 bp encoding an amino acid sequence from a HCV isolate J1 or J7 wherein the J1 or J7 amino acid sequence
is distinct from the amino acid sequence of HCV isolate HCV1.

[0013] Yet another embodiment of the present invention provides a purified polypeptide comprising an amino acid
sequence from an HCV isolate substantially homologous to an isolate selected from the group J1 and J7, wherein said
amino acid sequence is distinct from the sequence of the polypeptides encoded by the HCV isolate HCV1.
[0014] Still another embodiment of the present invention provides a polypeptide comprising an amino acid sequence
from a HCV isolate J1 or J7 wherein the J1 or J7 amino acid sequence is distinct from the amino acid sequence of HCV
isolate HCV1 and the polypeptide is immobilized on a solid support.

[0015] In a further embodiment of the present invention, an immunoassay for detecting the presence of anti-HCV
antibodies in a test sample is provided comprising: (a) incubating the test sample under conditions that allow the formation
of antigen-antibody complexes with an immunogenic polypeptide comprising an amino acid sequence from an HCV
isolate substantially homologous to an isolate selected from the group J1 and J7, wherein the amino acid sequence is
distinct from the amino acid sequence of HCV isolate HCV1; and (b) detecting an antigen-antibody complex comprising
the immunogenic polypeptide.

[0016] The present invention also provides a composition comprising anti-HCV antibodies that bind an HCV epitope
substantially free of antibodies that do not bind an HCV epitope, wherein: (a) the HCV epitope comprises an amino acid
sequence from an HCV isolate substantially homologous to an isolate selected from the group J1 and J7, wherein the
amino acid sequence is distinct from the amino acid sequence of HCV isolate HCV1; and (b) the J1 or J7 amino acid
sequence is not immunologically cross-reactive with HCV1.

[0017] A further embodiment of the present invention provides an immunoassay for detecting the presence of an HCV
polypeptide in a test sample comprising: (a) incubating the test sample under conditions that allow the formation of
antigen-antibody complexes with anti-HCV antibodies that bind an HCV epitope wherein: (i) the HCV epitope comprises an
amino acid sequence from a HCV isolate J1 or J7; (ii) the J1 or J7 amino acid sequence is distinct from the amino acid
sequence of HCV isolate HCV1; and (iii) the J1 or J7 amino acid sequence is not immunologically cross-reactive with
HCV1; and (b) detecting an antigen-antibody complex comprising the anti-HCV antibodies.
[0018] Also provided by the present invention is a method of producing anti-HCV antibodies comprising administering
to a mammal a polypeptide comprising an amino acid sequence from a HCV isolate J1 or J7 wherein the J1 or J7 amino
acid sequence is distinct from the amino acid sequence of HCV isolate HCV1 whereby the mammal produces anti-HCV
antibodies.

[0019] Yet another embodiment of the present invention provides a method of detecting HCV polynucleotides in a test
sample comprising: (a) providing a probe comprising the DNA molecule of claim 1; (b) contacting the test sample and
the probe under conditions that allow for the formation of a polynucleotide duplex between the probe and its
complement in the absence of substantial polynucleotide duplex formation between the probe and non-HCV polynucleotide
sequences present in the test sample; and (c) detecting any polynucleotide duplexes comprising the probe.
[0020] A still further embodiment of the present invention provides a method of producing a recombinant polypeptide
comprising an HCV amino acid sequence, the method comprising: (a) providing host cells transformed by a DNA
construct comprising control sequences for the host cell operably linked to a coding sequence encoding an amino acid
sequence from a HCV isolate J1 or J7 wherein the J1 or J7 amino acid sequence is distinct from the amino acid
sequence of HCV isolate HCV1; (b) growing the host cells under conditions whereby the coding sequence is transcribed
and translated into the recombinant polypeptide; and (c) recovering the recombinant polypeptide.
[0021] These and other embodiments of the present invention will be readily apparent to those of ordinary skill in the
art in view of the following description.

Brief Description of the Figures 

[0022]

Figure 1 shows the consensus sequence of the coding strand of a fragment from the J7 C/E domain with the het- 
erogeneities. 

Figure 2 shows the consensus sequence of the coding strand of a fragment from the J1 E domain with the
heterogeneities.

Figure 3 shows the consensus sequence of the coding strand of a fragment of the J1 E/NS1 domain with the het- 
erogeneities. 

Figure 4 shows the consensus sequence of the coding strand of a fragment from the J1 NS3 domain with the het- 
erogeneities. 

Figure 5 shows the consensus sequence of the coding strand of a fragment from the J1 NS5 domain with the
heterogeneities.

Figure 6 shows the homology of the J7 C/E consensus sequence with the nucleotide sequence of the same domain
from HCV1.

Figure 7 shows the homology of the J1 E consensus sequence with the nucleotide sequence of the same domain
from HCV1.

Figure 8 shows the homology of the J1 E/NS1 consensus sequence with the nucleotide sequence of the same
domain from HCV1.

Figure 9 shows the homology of the J1 NS3 consensus sequence with the nucleotide sequence of the same
domain from HCV1.

Figure 10 shows the homology of the J1 NS5 consensus sequence with the nucleotide sequence of the same
domain from HCV1.

Figure 11 shows the putative genomic organization of the HCV1 genome.

Figure 12 shows the nucleotide sequence of the ORF of HCV1. In the figure nucleotide number 1 is the first A of 
the putative initiating methionine of the large ORF; nucleotides upstream of this nucleotide are numbered with neg- 
ative numbers. 

Figure 13 shows the consensus sequence of the coding strand of a fragment from the J1 NS1 domain (J1 1519)
with the nucleotide sequence of the same domain from HCV1. Also shown are the amino acids encoded therein.

Figure 14 shows a composite of the consensus sequence from the core to the NS1 domain of J1 with the nucleotide
sequence of the same domain from HCV1. Also shown are the amino acids encoded therein.

Figure 15 shows a consensus sequence of the coding strand of the NS1 domain of J1, as determined in Example
IV. Also shown are the nucleotide sequence of the same domain from HCV1, and the amino acids encoded in the
HCV1 and J1 sequences.

Figure 16 shows a consensus sequence of a coding strand of the C200 region of the NS3-NS4 domain of J1. Also
shown are the nucleotide sequence of the same domain from HCV1. Also shown are the amino acids encoded in
the sequences.

Figure 17 shows a consensus sequence of the coding strand of the NS1 domain of J1, as determined in Example
V. Also shown are the nucleotide sequence of the same domain from HCV1, and the amino acids encoded in the
sequences.

Figure 18 shows a consensus sequence of the coding strand of the untranslated and core domains of J1. Also
shown are the nucleotide sequence of the same domain from HCV1, and the amino acids encoded in the
sequences.

Detailed Description of the Invention 

[0023] The practice of the present invention will employ, unless otherwise indicated, conventional techniques of
molecular biology, microbiology, recombinant DNA techniques, and immunology, which are within the skill of the art.
Such techniques are explained fully in the literature. See, e.g., Maniatis, Fritsch & Sambrook, MOLECULAR CLONING:
A LABORATORY MANUAL (1982); DNA CLONING, VOLUMES I AND II (D.N. Glover ed. 1985); OLIGONUCLEOTIDE
SYNTHESIS (M.J. Gait ed. 1984); NUCLEIC ACID HYBRIDIZATION (B.D. Hames & S.J. Higgins eds. 1984);
TRANSCRIPTION AND TRANSLATION (B.D. Hames & S.J. Higgins eds. 1984); ANIMAL CELL CULTURE (R.I. Freshney ed.
1986); IMMOBILIZED CELLS AND ENZYMES (IRL Press, 1986); B. Perbal, A PRACTICAL GUIDE TO MOLECULAR
CLONING (1984); the series, METHODS IN ENZYMOLOGY (Academic Press, Inc.); GENE TRANSFER VECTORS
FOR MAMMALIAN CELLS (J.H. Miller and M.P. Calos eds. 1987, Cold Spring Harbor Laboratory); Methods in
Enzymology Vol. 154 and Vol. 155 (Wu and Grossman, and Wu, eds., respectively); Mayer and Walker, eds. (1987),
IMMUNOCHEMICAL METHODS IN CELL AND MOLECULAR BIOLOGY (Academic Press, London); Scopes (1987),
PROTEIN PURIFICATION: PRINCIPLES AND PRACTICE, Second Edition (Springer-Verlag, N.Y.); and HANDBOOK
OF EXPERIMENTAL IMMUNOLOGY, VOLUMES I-IV (D.M. Weir and C.C. Blackwell eds. 1986). All patents, patent
applications, and other publications mentioned herein, both supra and infra, are hereby incorporated herein by
reference.

[0024] The term "hepatitis C virus" has been reserved by workers in the field for a heretofore unknown etiologic
agent of NANBH. Accordingly, as used herein, "hepatitis C virus" (HCV) refers to an agent causative of NANBH, which
was formerly referred to as NANBV and/or BB-NANBV, from the class of the prototype isolate, HCV1, described by
Houghton et al. See, e.g., EPO Pub. No. 318,216 and U.S. Patent App. Serial No. 355,002, filed 19 May 1989 (available
in non-U.S. applications claiming priority therefrom), the disclosures of which are incorporated herein by reference. The
nucleotide sequence and putative amino acid sequence of HCV1 are shown in Figure 12. The terms HCV, NANBV and
BB-NANBV are used interchangeably herein. As an extension of this terminology, the disease caused by HCV, formerly
called NANB hepatitis (NANBH), is called hepatitis C. The terms NANBH and hepatitis C may be used interchangeably
herein. The term "HCV", as used herein, denotes a viral species of which pathogenic strains cause NANBH, as well as
attenuated strains or defective interfering particles derived therefrom.

[0025] HCV is a Flavi-like virus. The morphology and composition of Flavivirus particles are known, and are
discussed by Brinton (1986) THE VIRUSES: THE TOGAVIRIDAE AND FLAVIVIRIDAE (Series eds. Fraenkel-Conrat and
Wagner, vol. eds. Schlesinger and Schlesinger, Plenum Press), p. 327-374. Generally, with respect to morphology,
Flaviviruses contain a central nucleocapsid surrounded by a lipid bilayer. Virions are spherical and have a diameter of
about 40-50 nm. Their cores are about 25-30 nm in diameter. Along the outer surface of the virion envelope are
projections that are about 5-10 nm long with terminal knobs about 2 nm in diameter.
[0026] The HCV genome is comprised of RNA. It is known that RNA containing viruses have relatively high rates of
spontaneous mutation, i.e., reportedly on the order of 10⁻³ to 10⁻⁴ per incorporated nucleotide. Therefore, there are
multiple strains, which may be virulent or avirulent, within the HCV class or species.
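
By way of illustration only, the arithmetic implied by the reported mutation rate can be sketched as follows. The sketch assumes that misincorporation events are independent at each nucleotide and takes a genome length of about 9,000 nucleotides as a round figure; the function names are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only. Assumes independent misincorporation at each
# nucleotide; the genome length used in the example is a round figure.

def expected_mutations(genome_length_nt: int, rate_per_nt: float) -> float:
    """Expected number of misincorporated nucleotides per genome copy."""
    return genome_length_nt * rate_per_nt


def prob_at_least_one(genome_length_nt: int, rate_per_nt: float) -> float:
    """Probability that a genome copy carries at least one mutation."""
    return 1.0 - (1.0 - rate_per_nt) ** genome_length_nt


if __name__ == "__main__":
    for rate in (1e-3, 1e-4):
        print(rate, expected_mutations(9000, rate), prob_at_least_one(9000, rate))
```

Under these assumptions, a rate between 10⁻⁴ and 10⁻³ corresponds to roughly one to nine changes per copied genome, which is consistent with the existence of multiple strains.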

[0027] It is believed that the genome of HCV isolates is comprised of a single ORF of approximately 9,000 nucleotides
to approximately 12,000 nucleotides, encoding a polyprotein similar in size to that of HCV1, an encoded polyprotein of
similar hydrophobic and antigenic character to that of HCV1, and the presence of co-linear peptide sequences that are
conserved with HCV1. In addition, the genome is believed to be a positive-stranded RNA.

[0028] Isolates of HCV comprise epitopes that are immunologically cross-reactive with epitopes in the HCV1 genome.
At least some of these are epitopes unique to HCV when compared to other known Flaviviruses. The uniqueness of the
epitope may be determined by its immunological reactivity with anti-HCV antibodies and lack of immunological reactivity
with antibodies to other Flavivirus species. Methods for determining immunological reactivity are known in the art, for
example, by radioimmunoassay, by ELISA assay, by hemagglutination, and several examples of suitable techniques for
assays are provided herein.

[0029] It is also expected that the overall homology of HCV isolates and HCV1 genomes at the nucleotide level
probably will be about 40% or greater, probably about 60% or greater, and even more probably about 80% to about 90% or
greater. In addition, it is expected that there are many corresponding contiguous sequences of at least about 13 nucleotides that are
fully homologous. The correspondence between the sequence from a new isolate and the HCV1 sequence can be
determined by techniques known in the art. For example, it can be determined by a direct comparison of the
sequence information of the polynucleotide from the new isolate and HCV1 sequences. Alternatively, homology can be
determined by hybridization of the polynucleotides under conditions which form stable duplexes between homologous
regions (for example, those which would be used prior to digestion), followed by digestion with
single-stranded-specific nuclease(s), followed by size determination of the digested fragments.
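
By way of illustration only, the direct comparison of sequence information described above can be sketched as follows. The sketch assumes the two sequences have already been aligned to equal length without gaps; the function names and the short example sequences are hypothetical and are not taken from the figures.

```python
# Illustrative sketch only. Assumes two pre-aligned, equal-length,
# gap-free nucleotide sequences.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which the two sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def longest_identical_run(seq_a: str, seq_b: str) -> int:
    """Length of the longest contiguous stretch of identical positions."""
    best = current = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        current = current + 1 if a == b else 0
        best = max(best, current)
    return best


if __name__ == "__main__":
    isolate = "ACGTACGTAC"    # hypothetical fragment of a new isolate
    prototype = "ACGTTCGTAC"  # hypothetical corresponding fragment
    print(percent_identity(isolate, prototype))
    print(longest_identical_run(isolate, prototype))
```

A run length of at least about 13, as returned by the second function on full-length aligned sequences, would correspond to a fully homologous contiguous sequence of the kind mentioned above.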

[0030] Because of the evolutionary relationship of the strains or isolates of HCV, putative HCV strains or isolates are
identifiable by their homology at the polypeptide level. Thus, new HCV isolates are expected to be more than about 40%
homologous, probably more than about 70% homologous, and even more probably more than about 80% homologous,
and possibly even more than about 90% homologous at the polypeptide level. The techniques for determining amino
acid sequence homology are known in the art. For example, the amino acid sequence may be determined directly and
compared to the sequences provided herein. Alternatively, the nucleotide sequence of the genomic material of the
putative HCV may be determined, the amino acid sequence encoded therein can be determined, and the corresponding
regions compared.

[0031] The ORF of HCV1 is shown in Figure 12. The non-structural, core, and envelope domains of the polyprotein
have been predicted for HCV1 (Figure 11). The "C", or core, polypeptide is believed to be encoded from the 5' terminus
to about nucleotide 345 of HCV1. The putative "E", or envelope, domain of HCV1 is believed to be encoded from about
nucleotide 346 to about nucleotide 1050. Putative NS1, or non-structural one domain, is thought to be encoded from
about nucleotide 1051 to about nucleotide 1953. For the remaining domains, putative NS2 is thought to be encoded
from about nucleotide 1954 to about nucleotide 3018, putative NS3 from about nucleotide 3019 to about nucleotide
4950, putative NS4 from about nucleotide 4951 to about nucleotide 6297, and putative NS5 from about nucleotide 6298
to the 3' terminus, respectively. The above boundaries are approximations based on an analysis of the ORF. The exact
boundaries can be determined by those skilled in the art in view of the disclosure herein.
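
By way of illustration only, the approximate domain boundaries recited above can be expressed as a simple lookup. The sketch assumes that the core domain begins at nucleotide 1 of the ORF (the numbering used in Figure 12) and treats the approximate boundaries as exact; the table and function names are hypothetical.

```python
# Illustrative sketch only. The boundaries are the approximations given in
# the text; treating them as exact, and starting the core domain at ORF
# nucleotide 1, are assumptions made for this example.
from typing import Optional

PUTATIVE_HCV1_DOMAINS = [
    ("C", 1, 345),
    ("E", 346, 1050),
    ("NS1", 1051, 1953),
    ("NS2", 1954, 3018),
    ("NS3", 3019, 4950),
    ("NS4", 4951, 6297),
    ("NS5", 6298, None),  # extends to the 3' terminus
]


def putative_domain(nucleotide: int) -> Optional[str]:
    """Return the putative HCV1 domain containing an ORF nucleotide position.

    Positions below 1 (upstream of the initiating methionine) return None.
    """
    for name, start, end in PUTATIVE_HCV1_DOMAINS:
        if nucleotide >= start and (end is None or nucleotide <= end):
            return name
    return None


if __name__ == "__main__":
    for position in (100, 346, 1500, 5000, 7000):
        print(position, putative_domain(position))
```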

[0032] "HCV/J1" or "J1" and "HCV/J7" or "J7" refer to new HCV isolates characterized by the nucleotide sequence
disclosed herein, as well as related isolates that are substantially homologous thereto; i.e., at least about 90% or about
95% at the nucleotide level. It is believed that the sequences disclosed herein characterize an HCV subclass that is
predominant in Japan and other Asian and/or Pacific rim countries. Additional J1 and J7 isolates can be obtained in view
of the disclosure herein and EPO Pub. No. 318,216. In particular, the J1 and J7 nucleotide sequences disclosed herein,
as well as the HCV1 sequences in Figure 12, can be used as primers or probes to clone additional domains of J1, J7,
or additional isolates.

[0033] As used herein, a nucleotide sequence "from" a designated sequence or source refers to a nucleotide
sequence that is homologous (i.e., identical) to or complementary to the designated sequence or source, or a portion
thereof. The J1 sequences provided herein are a minimum of about 6 nucleotides, preferably about 8 nucleotides, more
preferably about 15 nucleotides, and most preferably 20 nucleotides or longer. The maximum length is the complete
viral genome.

[0034] In some aspects of the invention, the sequence of the region from which the polynucleotide is derived is
preferably homologous to or complementary to a sequence which is unique to an HCV genome or the J1 and J7 genome.
Whether or not a sequence is unique to a genome can be determined by techniques known to those of skill in the art.






EP0 939 128 A2 



For example, the sequence can be compared to sequences in databanks, e.g., GenBank, to determine whether it is
present in the uninfected host or other organisms. The sequence can also be compared to the known sequences of
other viral agents, including those which are known to induce hepatitis, e.g., HAV, HBV, and HDV, and to other members
of the Flaviviridae. The correspondence or non-correspondence of the derived sequence to other sequences can also
be determined by hybridization under the appropriate stringency conditions. Hybridization techniques for determining
the complementarity of nucleic acid sequences are known in the art. See also, for example, Maniatis et al. (1982)
MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor Press, Cold Spring Harbor, N.Y.). In
addition, mismatches of duplex polynucleotides formed by hybridization can be determined by known techniques, including,
for example, digestion with a nuclease such as S1 that specifically digests single-stranded areas in duplex
polynucleotides. Regions from which typical DNA sequences may be derived include, but are not limited to, regions encoding
specific epitopes, as well as non-transcribed and/or non-translated regions.

[0035] The J1 or J7 polynucleotide is not necessarily physically derived from the nucleotide sequence shown, but may
be generated in any manner, including, for example, chemical synthesis or DNA replication or reverse transcription or
transcription. In addition, combinations of regions corresponding to that of the designated sequence may be modified
in ways known in the art to be consistent with an intended use. The polynucleotides may also include one or more
labels, which are known to those of skill in the art.

[0036] An amino acid sequence "from" a designated polypeptide or source of polypeptides means that the amino acid
sequence is homologous (i.e., identical) to the sequence of the designated polypeptide, or a portion thereof. An amino
acid sequence "from" a designated nucleic acid sequence refers to a polypeptide having an amino acid sequence
identical to that of a polypeptide encoded in the sequence, or a portion thereof. The J1 or J7 amino acid sequences in the
polypeptides of the present invention are at least about 5 amino acids in length, preferably at least about 10 amino
acids, more preferably at least about 15 amino acids, and most preferably at least about 20 amino acids.
[0037] The polypeptides of the present invention are not necessarily translated from a designated nucleic acid
sequence; the polypeptides may be generated in any manner, including, for example, chemical synthesis, or expression
of a recombinant expression system, or isolation from virus. The polypeptides may include one or more analogs of
amino acids or unnatural amino acids. Methods of inserting analogs of amino acids into a sequence are known in the
art. The polypeptides may also include one or more labels, which are known to those of skill in the art.
[0038] The term "recombinant polynucleotide" as used herein intends a polynucleotide of genomic, cDNA,
semisynthetic, or synthetic origin which, by virtue of its origin or manipulation: (1) is linked to a polynucleotide other than that to
which it is linked in nature, or (2) does not occur in nature.

[0039] The term "polynucleotide" as used herein refers to a polymeric form of nucleotides of any length, either
ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, this term
includes double- and single-stranded DNA, and RNA. It also includes known types of modifications, for example, labels
which are known in the art, methylation, "caps", substitution of one or more of the naturally occurring nucleotides with
an analog, internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl
phosphonates, phosphotriesters, phosphoamidates, carbamates, etc.) and with charged linkages (e.g., phosphorothioates,
phosphorodithioates, etc.), those containing pendant moieties, such as, for example, proteins (including, e.g.,
nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g., acridine, psoralen, etc.),
those containing chelators (e.g., metals, radioactive metals, boron, oxidative metals, etc.), those containing alkylators,
those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as well as unmodified forms of the
polynucleotide.

[0040] "Purified polynucleotide" refers to a composition comprising a specified polynucleotide that is substantially free
of other components, such composition typically comprising at least about 70% of the specified polynucleotide, more
typically at least about 80%, 90% or even 95% to 99% of the specified polynucleotide.
[0041] "Purified polypeptide" refers to a composition comprising a specified polypeptide that is substantially free of
other components, such composition typically comprising at least about 70% of the specified polypeptide, more
typically at least about 80%, 90% or even 95% to 99% of the specified polypeptide.

[0042] "Recombinant host cells", "host cells", "cells", "cell lines", "cell cultures", and other such terms denote
microorganisms or higher eukaryotic cell lines cultured as unicellular entities that can be, or have been, used as recipients
for a recombinant vector or other transfer DNA, and include the progeny of the original cell which has been transformed.
It is understood that the progeny of a single parental cell may not necessarily be completely identical in morphology or
in genomic or total DNA complement as the original parent, due to natural, accidental, or deliberate mutation.
[0043] A "replicon" is any genetic element, e.g., a plasmid, a chromosome, a virus, a cosmid, etc., that behaves as an
autonomous unit of polynucleotide replication within a cell; i.e., capable of replication under its own control.

[0044] A "cloning vector" is a replicon that can transform a selected host cell and in which another polynucleotide
segment is attached, so as to bring about the replication and/or expression of the attached segment. Typically, cloning
vectors include plasmids, virus (e.g., bacteriophage vector) and cosmids.

[0045] An "integrating vector" is a vector that does not behave as a replicon in a selected host cell, but has the ability
to integrate into a replicon (typically a chromosome) resident in the selected host to stably transform the host.

[0046] An "expression vector" is a construct that can transform a selected host cell and provides for expression of a
heterologous coding sequence in the selected host. Expression vectors can be either a cloning vector or an integrating
vector.

[0047] A "coding sequence" is a polynucleotide sequence which is transcribed into mRNA and/or translated into a
polypeptide when placed under the control of appropriate regulatory sequences. The boundaries of the coding
sequence are determined by a translation start codon at the 5'-terminus and a translation stop codon at the 3'-terminus.
A coding sequence can include, but is not limited to, mRNA, cDNA, and recombinant polynucleotide sequences.
[0048] "Control sequence" refers to polynucleotide regulatory sequences which are necessary to effect the
expression of coding sequences to which they are ligated. The nature of such control sequences differs depending upon the
host organism. In prokaryotes, control sequences generally include promoter, ribosomal binding site, and terminators.
In eukaryotes, control sequences generally include promoters, terminators and, in some instances, enhancers. The
term "control sequences" is intended to include, at a minimum, all components the presence of which is necessary for
expression, and may also include additional advantageous components.

[0049] "Operably linked" refers to a juxtaposition wherein the components so described are in a relationship
permitting them to function in their intended manner. A control sequence "operably linked" to a coding sequence is ligated in
such a way that expression of the coding sequence is achieved under conditions compatible with the control
sequences.

[0050] An "open reading frame" or ORF is a region of a polynucleotide sequence which encodes a polypeptide; this
region may represent a portion of a coding sequence or a total coding sequence.
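
By way of illustration only, the definitions of a coding sequence bounded by a translation start codon and a translation stop codon can be sketched as a simple scan. The sketch assumes a DNA (cDNA) coding-strand sequence, the standard genetic code (start codon ATG; stop codons TAA, TAG and TGA), and a search that begins at the first ATG encountered; the function name and the example sequence are hypothetical.

```python
# Illustrative sketch only. Assumes a cDNA coding-strand sequence and the
# standard genetic code; only the first ATG is considered.
from typing import Optional

START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_coding_sequence(dna: str) -> Optional[str]:
    """Return the sequence from the first ATG through the first in-frame
    stop codon (inclusive), or None if no such bounded region exists."""
    dna = dna.upper()
    start = dna.find(START_CODON)
    if start == -1:
        return None
    # Walk codon by codon in the reading frame set by the start codon.
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]
    return None


if __name__ == "__main__":
    print(first_coding_sequence("CCATGAAATTTTAGGG"))
```

A region without an in-frame stop codon in the available sequence, such as a partial clone, would represent only a portion of a coding sequence, and the function above returns None in that case.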

[0051] "Immunologically cross-reactive" refers to two or more epitopes or polypeptides that are bound by the same
antibody. Cross-reactivity can be determined by any of a number of immunoassay techniques, such as a competition
assay.

[0052] As used herein, the term "antibody" refers to a polypeptide or group of polypeptides which comprise at least
one epitope. An "antigen binding site" is formed from the folding of the variable domains of an antibody molecule(s) to
form three-dimensional binding sites with an internal surface shape and charge distribution complementary to the
features of an epitope of an antigen, which allows specific binding to form an antibody-antigen complex. An antigen binding
site may be formed from a heavy- and/or light-chain domain (VH and VL, respectively), which form hypervariable loops
which contribute to antigen binding. The term "antibody" includes, without limitation, chimeric antibodies, altered
antibodies, univalent antibodies, Fab proteins, and single-domain antibodies. In many cases, the binding phenomena of
antibodies to antigens are equivalent to other ligand/anti-ligand binding.

[0053] As used herein, a "single domain antibody" (dAb) is an antibody which is comprised of a VH domain, which
binds specifically with a designated antigen. A dAb does not contain a VL domain, but may contain other antigen
binding domains known to exist in antibodies, for example, the kappa and lambda domains. Methods for preparing dAbs are
known in the art. See, for example, Ward et al., Nature 341:544 (1989).

[0054] Antibodies may also be comprised of VH and VL domains, as well as other known antigen binding domains.
Examples of these types of antibodies and methods for their preparation are known in the art (see, e.g., U.S. Patent
No. 4,816,467, which is incorporated herein by reference), and include the following. For example, "vertebrate
antibodies" refers to antibodies which are tetramers or aggregates thereof, comprising light and heavy chains which are usually
aggregated in a "Y" configuration and which may or may not have covalent linkages between the chains. In vertebrate
antibodies, the amino acid sequences of the chains are homologous with those sequences found in antibodies
produced in vertebrates, whether in situ or in vitro (for example, in hybridomas). Vertebrate antibodies include, for example,
purified polyclonal antibodies and monoclonal antibodies, methods for the preparation of which are described infra.
[0055] "Hybrid antibodies" are antibodies where chains are separately homologous with reference to mammalian
antibody chains and represent novel assemblies of them, so that two different antigens are precipitable by the tetramer or
aggregate. In hybrid antibodies, one pair of heavy and light chains is homologous to those found in an antibody raised
against a first antigen, while a second pair of chains is homologous to those found in an antibody raised against a
second antigen. This results in the property of "divalence", i.e., the ability to bind two antigens simultaneously. Such
hybrids may also be formed using chimeric chains, as set forth below.

[0056] "Chimeric antibodies" refers to antibodies in which the heavy and/or light chains are fusion proteins. Typically,
one portion of the amino acid sequences of the chain is homologous to corresponding sequences in an antibody
derived from a particular species or a particular class, while the remaining segment of the chain is homologous to the
sequences derived from another species and/or class. Usually, the variable region of both light and heavy chains
mimics the variable regions of antibodies derived from one species of vertebrates, while the constant portions are
homologous to the sequences in the antibodies derived from another species of vertebrates. However, the definition is not
limited to this particular example. Also included is any antibody in which either or both of the heavy or light chains are
composed of combinations of sequences mimicking the sequences in antibodies of different sources, whether these
sources be from differing classes or different species of origin, and whether or not the fusion point is at the
variable/constant boundary. Thus, it is possible to produce antibodies in which neither the constant nor the variable region mimics
known antibody sequences. It then becomes possible, for example, to construct antibodies whose variable region has a
higher specific affinity for a particular antigen, or whose constant region can elicit enhanced complement fixation, or to
make other improvements in properties possessed by a particular constant region.

[0057] Another example is "altered antibodies", which refers to antibodies in which the naturally occurring amino acid
sequence in a vertebrate antibody has been varied. Utilizing recombinant DNA techniques, antibodies can be
redesigned to obtain desired characteristics. The possible variations are many, and range from the changing of one or more
amino acids to the complete redesign of a region, for example, the constant region. Changes in the constant region are,
in general, made to attain desired cellular process characteristics, e.g., changes in complement fixation, interaction with
membranes, and other effector functions. Changes in the variable region may be made to alter antigen binding
characteristics. The antibody may also be engineered to aid the specific delivery of a molecule or substance to a specific cell or
tissue site. The desired alterations may be made by known techniques in molecular biology, e.g., recombinant
techniques, site-directed mutagenesis, etc.

[0058] Yet another example is "univalent antibodies", which are aggregates comprised of a heavy-chain/light-chain
dimer bound to the Fc (i.e., stem) region of a second heavy chain. This type of antibody escapes antigenic modulation.
See, e.g., Glennie et al., Nature 223: 712 (1982). Included also within the definition of antibodies are "Fab" fragments of
antibodies. The "Fab" region refers to those portions of the heavy and light chains which are roughly equivalent, or
analogous, to the sequences which comprise the branch portion of the heavy and light chains, and which have been shown
to exhibit immunological binding to a specified antigen, but which lack the effector Fc portion. "Fab" includes aggregates
of one heavy and one light chain (commonly known as Fab'), as well as tetramers containing the 2H and 2L chains
(referred to as F(ab)2), which are capable of selectively reacting with a designated antigen or antigen family. Fab
antibodies may be divided into subsets analogous to those described above, i.e., "vertebrate Fab", "hybrid Fab", "chimeric
Fab", and "altered Fab". Methods of producing Fab fragments of antibodies are known within the art and include, for
example, proteolysis, and synthesis by recombinant techniques.
[0059] "Epitope" refers to an antibody binding site usually defined by a polypeptide, but also by non-amino acid
haptens. An epitope could comprise 3 amino acids in a spatial conformation which is unique to the epitope; generally an
epitope consists of at least 5 such amino acids, and more usually, consists of at least 8-10 such amino acids.
[0060] "Antigen-antibody complex" refers to the complex formed by an antibody that is specifically bound to an epitope
on an antigen.

[0061] "Immunogenic polypeptide" refers to a polypeptide that elicits a cellular and/or humoral immune response in a
mammal, whether alone or linked to a carrier, in the presence or absence of an adjuvant.

[0062] "Polypeptide" refers to a polymer of amino acids and does not refer to a specific length of the molecule. Thus,
peptides, oligopeptides, and proteins are included within the definition of polypeptide. This term also does not refer to
or exclude post-expression modifications of the polypeptide, for example, glycosylations, acetylations,
phosphorylations and the like. Included within the definition are, for example, polypeptides containing one or more analogs of an
amino acid (including, for example, unnatural amino acids, etc.), polypeptides with substituted linkages, as well as other
modifications known in the art, both naturally occurring and non-naturally occurring.

[0063] "Transformation", as used herein, refers to the insertion of an exogenous polynucleotide into a host cell,
irrespective of the method used for the insertion, for example, direct uptake, transduction, f-mating or electroporation. The
exogenous polynucleotide may be maintained as a non-integrated vector, for example, a plasmid, or alternatively, may
be integrated into the host genome.

[0064] A "transformed" host cell refers to both the immediate cell that has undergone transformation and its progeny 
that maintain the originally exogenous polynucleotide. 
[0065] "Treatment" as used herein refers to prophylaxis and/or therapy. 
[0066] "Individual" refers to vertebrates, particularly members of the mammalian species, and includes but is not
limited to domestic animals, sports animals, and primates, including humans.

[0067] "Sense strand" refers to the strand of a double-stranded DNA molecule that is homologous to an mRNA
transcript thereof. The "anti-sense strand" contains a sequence which is complementary to that of the "sense strand".
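
The relationship between the two strands can be illustrated with a minimal sketch. The function names and the example sequence are hypothetical and are not taken from the disclosed isolates; the sketch assumes an unambiguous upper- or lower-case A/C/G/T alphabet.

```python
# Watson-Crick pairing for DNA bases (unambiguous A/C/G/T only in this sketch).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_strand(sense: str) -> str:
    """Return the anti-sense strand, written 5'->3', for a sense-strand DNA sequence."""
    return sense.upper().translate(_COMPLEMENT)[::-1]


def mrna_from_sense(sense: str) -> str:
    """Return the mRNA homologous to the sense strand (T replaced by U)."""
    return sense.upper().replace("T", "U")


example = "ATGGCGTTA"  # hypothetical sequence, for illustration only
print(antisense_strand(example))  # TAACGCCAT
print(mrna_from_sense(example))   # AUGGCGUUA
```

Taking the anti-sense of the anti-sense strand returns the original sense strand, which is a convenient sanity check.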
[0068] "Antibody-containing body component" refers to a component of an individual's body which is a source of the
antibodies of interest. Antibody-containing body components are known in the art, and include but are not limited to,
whole blood and components thereof, plasma, serum, spinal fluid, lymph fluid, the external sections of the respiratory,
intestinal, and genitourinary tracts, tears, saliva, milk, white blood cells, and myelomas.

[0069] "Purified HCV" isolate refers to a preparation of HCV particles which has been isolated from the cellular
constituents with which the virus is normally associated, and from other types of viruses which may be present in the
infected tissue. The techniques for isolating viruses are known to those of skill in the art and include, for example,
centrifugation and affinity chromatography.

[0070] An HCV "particle" is an entire virion, as well as particles which are intermediates in virion formation. HCV
particles generally have one or more HCV proteins associated with the HCV nucleic acid.
[0071] "Probe" refers to a polynucleotide which forms a hybrid structure with a sequence in a target polynucleotide,
due to complementarity of at least one region in the probe with a region in the target.

[0072] "Biological sample" refers to a sample of tissue or fluid isolated from an individual, including but not limited to,
for example, whole blood and components thereof, plasma, serum, spinal fluid, lymph fluid, the external sections of the
skin, respiratory, intestinal, and genitourinary tracts, tears, saliva, milk, blood cells, tumors, organs, and also samples
of in vitro cell culture constituents (including but not limited to conditioned medium resulting from the growth of cells in
cell culture medium, putatively virally infected cells, recombinant cells, and cell components).
[0073] The invention pertains to the isolation and characterization of newly discovered isolates of HCV, J1 and J7,
their nucleotide sequences, their protein sequences and resulting polynucleotides, polypeptides and antibodies derived
therefrom. Isolates J1 and J7 are novel in their nucleotide and amino acid sequences, and are believed to be characteristic
of HCV isolates from Japan and other Asian countries.

[0074] The nucleotide sequences derived from HCV/J1 and HCV/J7 are useful as probes to diagnose the presence
of virus in samples, and to isolate other naturally occurring variants of the virus. These nucleotide sequences also make
available polypeptide sequences of HCV antigens encoded within the J1 and J7 genome and permit the production of
polypeptides which are useful as standards or reagents in diagnostic tests and/or as components of vaccines.
Antibodies, both polyclonal and monoclonal, directed against HCV epitopes contained within these polypeptide sequences are
also useful for diagnostic tests, as therapeutic agents, for screening of antiviral agents, and for the isolation of the
NANBH virus. In addition, by utilizing probes derived from the sequences disclosed herein it is possible to isolate and
sequence other portions of the J1 and J7 genome, thus giving rise to additional probes and polypeptides which are
useful in the diagnosis and/or treatment, both prophylactic and therapeutic, of NANBH.

[0075] The availability of the HCV/J1 and HCV/J7 nucleotide sequences enables the construction of polynucleotide
probes and polypeptides useful in diagnosing NANBH due to HCV infection and in screening blood donors as well as
donated blood and blood products for infection. For example, from the sequences it is possible to synthesize DNA
oligomers of about 8-10 nucleotides, or larger, which are useful as hybridization probes to detect the presence of HCV
RNA in, for example, sera of subjects suspected of harboring the virus, or for screening donated blood for the presence
of the virus. The HCV/J1 and HCV/J7 sequences also allow the design and production of HCV specific polypeptides
which are useful as diagnostic reagents for the presence of antibodies raised during NANBH. Antibodies to purified
polypeptides derived from the HCV/J1 and HCV/J7 sequences may also be used to detect viral antigens in infected
individuals and in blood.
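
As a rough illustration of how short oligomer probes relate to a known sequence, the following sketch enumerates every candidate probe of a chosen length that is complementary to a sense-strand cDNA, and locates the sites in a target RNA to which a probe is perfectly complementary. The helper names and the 20-nucleotide example sequence are hypothetical. Requiring perfect complementarity is a simplifying assumption of this sketch; real hybridization conditions tolerate some mismatches.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna: str) -> str:
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def candidate_probes(cdna_sense: str, length: int = 10) -> List[str]:
    """All DNA oligomers of the given length complementary to the sense sequence."""
    s = cdna_sense.upper()
    return [revcomp(s[i:i + length]) for i in range(len(s) - length + 1)]


def probe_sites(probe: str, target_rna: str) -> List[int]:
    """0-based positions in the target RNA where the probe is perfectly complementary."""
    target = target_rna.upper().replace("U", "T")
    site = revcomp(probe)  # the stretch of target the probe would pair with
    return [i for i in range(len(target) - len(site) + 1)
            if target[i:i + len(site)] == site]


cdna = "ATGAGCACGAATCCTAAACC"   # hypothetical 20-nt sense sequence
rna = cdna.replace("T", "U")    # the corresponding RNA target
probes = candidate_probes(cdna, length=10)
print(len(probes), probes[0], probe_sites(probes[0], rna))
```

A probe that maps to more than one site, or to a site in an unrelated sequence, would be a poor candidate; filtering on `len(probe_sites(...)) == 1` is one simple screen.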

[0076] Knowledge of these HCV/J1 and HCV/J7 sequences also enables the design and production of polypeptides
which may be used as vaccines against HCV and also for the production of antibodies, which in turn may be used for
protection against the disease, and/or for therapy of HCV infected individuals. Moreover, the disclosed HCV/J1 and
HCV/J7 sequences enable further characterization of the HCV genome. Polynucleotide probes derived from these
sequences, as well as from the HCV genome, may be used to screen cDNA libraries for additional viral cDNA
sequences, which, in turn, may be used to obtain additional overlapping sequences. See, e.g., EPO Pub. No. 318,216.
[0077] The HCV/J1 and HCV/J7 polynucleotide sequences, the polypeptides derived therefrom and the antibodies
directed against these polypeptides, are useful in the isolation and identification of the BB-NANBV agent(s). For
example, antibodies directed against HCV epitopes contained in polypeptides derived from the HCV/J1 sequences may be
used in processes based upon affinity chromatography to isolate the virus. Alternatively, the antibodies may be used to
identify viral particles isolated by other techniques. The viral antigens and the genomic material within the isolated viral
particles may then be further characterized.

[0078] The information obtained from further sequencing of the HCV/J1 and HCV/J7 genome, as well as from further
characterization of the HCV/J1 and HCV/J7 antigens and characterization of the genomes, enables the design and
synthesis of additional probes and polypeptides and antibodies which may be used for diagnosis, for prevention, and for
therapy of HCV induced NANBH, and for screening for infected blood and blood-related products.

[0079] The availability of HCV/J1 and HCV/J7 cDNA sequences permits the construction of expression vectors
encoding antigenically active regions of the polypeptide encoded in either strand. These antigenically active regions
may be derived from coat or envelope antigens or from core antigens, or from antigens which are non-structural
including, for example, polynucleotide binding proteins, polynucleotide polymerase(s), and other viral proteins required for the
replication and/or assembly of the virus particle. Fragments encoding the desired polypeptides are derived from the
cDNA clones using conventional restriction digestion or by synthetic methods, and are ligated into vectors which may,
for example, contain portions of fusion sequences such as beta-galactosidase or superoxide dismutase (SOD).
Methods and vectors which are useful for the production of polypeptides which contain fusion sequences of SOD are
described in EPO Pub. No. 196,056. Vectors encoding fusion polypeptides of SOD and HCV polypeptides are
described in EPO Pub. No. 318,216. Any desired portion of the HCV cDNA containing an open reading frame, in either
sense strand, can be obtained as a recombinant polypeptide, such as a mature or fusion protein. Alternatively, a
polypeptide encoded in the cDNA can be provided by chemical synthesis.
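
Locating open reading frames on either strand of a cDNA can be sketched as a scan of the three reading frames of each strand for stop-free stretches. This is only an illustrative sketch, not a method described in the specification: the function names, the minimum-length threshold, and the use of the standard stop codons TAA, TAG and TGA are assumptions made here. Consistent with the definition of an ORF given earlier, a stretch need not begin with a start codon, since it may represent only a portion of a coding sequence.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna: str) -> str:
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def open_frames(dna: str, min_codons: int = 30) -> List[Tuple[str, int, int, int]]:
    """Return (strand, frame, start, end) for stop-free stretches of at least
    min_codons codons. Coordinates are 0-based, half-open, and relative to the
    strand being read ('+' as given, '-' its reverse complement)."""
    results = []
    for strand, seq in (("+", dna.upper()), ("-", revcomp(dna))):
        for frame in range(3):
            run_start = frame
            pos = frame
            while pos + 3 <= len(seq):
                if seq[pos:pos + 3] in STOP_CODONS:
                    if (pos - run_start) // 3 >= min_codons:
                        results.append((strand, frame, run_start, pos))
                    run_start = pos + 3  # next stretch starts after the stop
                pos += 3
            # stretch that runs to the end of the sequence without a stop
            if (pos - run_start) // 3 >= min_codons:
                results.append((strand, frame, run_start, pos))
    return results


toy = "ATGGCTGCTGCTTAAGG"  # hypothetical sequence, for illustration only
for hit in open_frames(toy, min_codons=4):
    print(hit)
```

In practice the threshold would be set far higher than in this toy call, so that only frames long enough to encode a polypeptide of interest are reported.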

[0080] The DNA encoding the desired polypeptide, whether in fused or mature form, and whether or not containing a
signal sequence to permit secretion, may be ligated into expression vectors suitable for any convenient host. Both
eukaryotic and prokaryotic host systems are presently used in forming recombinant polypeptides, and a summary of
some of the more common control systems and host cells is given below. The polypeptide produced in such host cells
is then isolated from lysed cells or from the culture medium and purified to the extent needed for its intended use.
Purification may be by techniques known in the art, for example, differential extraction, salt fractionation, chromatography
on ion exchange resins, affinity chromatography, centrifugation, and the like. See, for example, Methods in Enzymology
for a variety of methods for purifying proteins.

[0081] Such recombinant or synthetic HCV polypeptides can be used as diagnostics, or those which give rise to
neutralizing antibodies may be formulated into vaccines. Antibodies raised against these polypeptides can also be used as
diagnostics, or for passive immunotherapy. In addition, antibodies to these polypeptides are useful for isolating and
identifying HCV particles.

[0082] The HCV antigens may also be isolated from HCV virions. The virions may be grown in HCV infected cells in
tissue culture, or in an infected host.

[0083] While the polypeptides of the present invention may comprise a substantially complete viral domain, in many
applications all that is required is that the polypeptide comprise an antigenic or immunogenic region of the virus. An
antigenic region of a polypeptide is generally relatively small, typically 8 to 10 amino acids or less in length. Fragments
of as few as 5 amino acids may characterize an antigenic region. These segments may correspond to regions of
HCV/J1 or HCV/J7 epitopes. Accordingly, using the cDNAs of HCV/J1 and HCV/J7 as a basis, DNAs encoding short
segments of HCV/J1 and HCV/J7 polypeptides can be expressed recombinantly either as fusion proteins, or as isolated
polypeptides. In addition, short amino acid sequences can be conveniently obtained by chemical synthesis.

[0084] In instances wherein the synthesized polypeptide is correctly configured so as to provide the correct epitope,
but is too small to be immunogenic, the polypeptide may be linked to a suitable carrier. A number of techniques for
obtaining such linkage are known in the art, including the formation of disulfide linkages using
N-succinimidyl-3-(2-pyridyl-thio)propionate (SPDP) and succinimidyl 4-(N-maleimido-methyl)cyclohexane-1-carboxylate (SMCC) obtained
from Pierce Company, Rockford, Illinois. (If the peptide lacks a sulfhydryl group, this can be provided by addition of a
cysteine residue.) These reagents create a disulfide linkage between themselves and peptide cysteine residues on one
protein and an amide linkage through the epsilon-amino on a lysine, or other free amino group in the other. A variety of
such disulfide/amide-forming agents are known. See, for example, Immun. Rev. (1982) 62:185. Other bifunctional
coupling agents form a thioether rather than a disulfide linkage. Many of these thio-ether-forming agents are commercially
available and include reactive esters of 6-maleimidocaproic acid, 2-bromoacetic acid, 2-iodoacetic acid,
4-(N-maleimido-methyl)cyclohexane-1-carboxylic acid, and the like. The carboxyl groups can be activated by combining them with
succinimide or 1-hydroxyl-2-nitro-4-sulfonic acid, sodium salt. An additional method of coupling antigens employs the
rotavirus/"binding peptide" system described in EPO Pub. No. 259,149, the disclosure of which is incorporated herein
by reference. The foregoing list is not meant to be exhaustive, and modifications of the named compounds can clearly
be used.

[0085] Any carrier may be used which does not itself induce the production of antibodies harmful to the host. Suitable
carriers are typically large, slowly metabolized macromolecules such as proteins; polysaccharides, such as latex
functionalized sepharose, agarose, cellulose, cellulose beads and the like; polymeric amino acids, such as polyglutamic
acid, polylysine, and the like; amino acid copolymers; and inactive virus particles. Especially useful protein substrates
are serum albumins, keyhole limpet hemocyanin, immunoglobulin molecules, thyroglobulin, ovalbumin, tetanus toxoid,
and other proteins well known to those skilled in the art.

[0086] In addition to full-length viral proteins, polypeptides comprising truncated HCV amino acid sequences
encoding at least one viral epitope are useful immunological reagents. For example, polypeptides comprising such truncated
sequences can be used as reagents in an immunoassay. These polypeptides also are candidate subunit antigens in
compositions for antiserum production or vaccines. While these truncated sequences can be produced by various
known treatments of native viral protein, it is generally preferred to make synthetic or recombinant polypeptides
comprising an HCV sequence. Polypeptides comprising these truncated HCV sequences can be made up entirely of HCV
sequences (one or more epitopes, either contiguous or noncontiguous), or HCV sequences and heterologous
sequences in a fusion protein. Useful heterologous sequences include sequences that provide for secretion from a
recombinant host, enhance the immunological reactivity of the HCV epitope(s), or facilitate the coupling of the
polypeptide to an immunoassay support or a vaccine carrier. See, e.g., EPO Pub. No. 116,201; U.S. Pat. No. 4,722,840; EPO
Pub. No. 259,149; U.S. Pat. No. 4,629,783, the disclosures of which are incorporated herein by reference.
[0087] The size of polypeptides comprising the truncated HCV sequences can vary widely, the minimum size being
a sequence of sufficient size to provide an HCV epitope, while the maximum size is not critical. In some applications,
the maximum size usually is not substantially greater than that required to provide the desired HCV epitopes and
function(s) of the heterologous sequence, if any. Typically, the truncated HCV amino acid sequence will range from about 5
to about 100 amino acids in length. More typically, however, the HCV sequence will be a maximum of about 50 amino
acids in length, preferably a maximum of about 30 amino acids. It is usually desirable to select HCV sequences of at
least about 10, 12 or 15 amino acids, up to a maximum of about 20 or 25 amino acids.

[0088] Truncated HCV amino acid sequences comprising epitopes can be identified in a number of ways. For
example, the entire viral protein sequence can be screened by preparing a series of short peptides that together span the
entire protein sequence. By starting with, for example, 100-mer polypeptides, it would be routine to test each
polypeptide for the presence of epitope(s) showing a desired reactivity, and then testing progressively smaller and overlapping
fragments from an identified 100-mer to map the epitope of interest. Screening such peptides in an immunoassay is
within the skill of the art. It is also known to carry out a computer analysis of a protein sequence to identify potential
epitopes, and then prepare oligopeptides comprising the identified regions for screening. It is appreciated by those of
skill in the art that such computer analysis of antigenicity does not always identify an epitope that actually exists, and
can also incorrectly identify a region of the protein as containing an epitope.
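
The peptide-scanning strategy described above can be sketched as a tiling of the protein sequence into overlapping windows, followed by a finer tiling of any window that reacts. The function name, the window and step sizes, and the placeholder sequence are hypothetical choices for illustration; the specification does not prescribe particular overlaps.

```python
from typing import List, Tuple


def overlapping_peptides(protein: str, length: int = 100,
                         step: int = 50) -> List[Tuple[int, str]]:
    """Return (start, peptide) pairs that together span the whole protein.

    Windows of `length` residues are taken every `step` residues; a final
    window is added, shifted back if necessary, so the C-terminus is covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    n = len(protein)
    if n <= length:
        return [(0, protein)]
    starts = list(range(0, n - length + 1, step))
    if starts[-1] != n - length:
        starts.append(n - length)
    return [(s, protein[s:s + length]) for s in starts]


protein = "ACDEFGHIKLMNPQRSTVWY"  # placeholder 20-residue sequence
first_pass = overlapping_peptides(protein, length=8, step=5)
print(first_pass)

# Suppose the window starting at residue 5 reacted in the immunoassay
# (a made-up outcome); map it further with shorter, overlapping fragments.
hit_start, hit = first_pass[1]
second_pass = [(hit_start + s, p)
               for s, p in overlapping_peptides(hit, length=4, step=2)]
print(second_pass)
```

Choosing a step smaller than the window length ensures that an epitope straddling a window boundary still falls wholly within at least one peptide, provided the epitope is no longer than the overlap.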

[0089] The observed relationship of the putative polyproteins of HCV and the Flaviviruses allows a prediction of the
putative domains of the HCV "non-structural" (NS) proteins. The locations of the individual NS proteins in the putative
Flavivirus precursor polyprotein are fairly well-known. Moreover, these also coincide with observed gross fluctuations in
the hydrophobicity profile of the polyprotein. It is established that NS5 of Flaviviruses encodes the virion polymerase,
and that NS1 corresponds with a complement fixation antigen which has been shown to be an effective vaccine in
animals. Recently, it has been shown that a flaviviral protease function resides in NS3. Due to the observed similarities
between HCV and the Flaviviruses, deductions concerning the approximate locations of the corresponding protein
domains and functions in the HCV polyprotein are possible. Figure 11 is a schematic of putative domains of the HCV
polyprotein. The expression of polypeptides containing these domains in a variety of recombinant host cells, including,
for example, bacteria, yeast, insect and vertebrate cells, should give rise to important immunological reagents which
can be used for diagnosis, detection, and vaccines.
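
A hydrophobicity profile of the kind referred to above is commonly computed as a sliding-window average of per-residue hydropathy values. The sketch below uses the Kyte-Doolittle scale and a 9-residue window; both are assumptions made here for illustration, since the specification does not state which scale or window was used. The function name and the test sequences are likewise hypothetical.

```python
from typing import List

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> List[float]:
    """Mean hydropathy of each window of `window` consecutive residues.

    Entry i corresponds to residues i .. i+window-1. Raises KeyError for
    residues outside the 20 standard amino acids.
    """
    seq = protein.upper()
    if window <= 0 or window > len(seq):
        return []
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]  # slide the window by one
        profile.append(total / window)
    return profile


# A hydrophobic stretch followed by a basic stretch (made-up sequence).
profile = hydropathy_profile("LLLLLLLLLKKKKKKKKK", window=9)
print(profile[0], profile[-1])
```

Gross shifts in such a profile, for example a basic N-terminal region followed by extended hydrophobic stretches, are the kind of feature that can be compared between polyproteins when predicting approximate domain boundaries.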

[0090] Although the non-structural protein region of the putative polyproteins of the HCV isolate described herein and
of Flaviviruses appears to be generally similar, there is less similarity between the putative structural regions which are
towards the N-terminus. In this region, there is a greater divergence in sequence, and in addition, the hydrophobic
profile of the two regions shows less similarity. This "divergence" begins in the N-terminal region of the putative NS1 domain
in HCV, and extends to the presumed N-terminus. Nevertheless, it is still possible to predict the approximate locations
of the putative nucleocapsid (N-terminal basic domain) and E (generally hydrophobic) domains within the HCV
polyprotein. From these predictions it may be possible to identify approximate regions of the HCV polyprotein that could
correspond with useful immunological reagents. For example, the E and NS1 proteins of Flaviviruses are known to have
efficacy as protective vaccines. These regions, as well as some which are shown to be antigenic in the HCV1, for
example those within putative NS3, C, and NS5, etc., should also provide diagnostic reagents.

[0091] The immunogenicity of the HCV sequences may also be enhanced by preparing the sequences fused to or
assembled with particle-forming proteins such as, for example, hepatitis B surface antigen or rotavirus VP6 antigen.
Constructs wherein the HCV epitope is linked directly to the particle-forming protein coding sequences produce hybrids
which are immunogenic with respect to the HCV epitope. In addition, all of the vectors prepared include epitopes
specific to HBV, having various degrees of immunogenicity, such as, for example, the pre-S peptide. Thus, particles
constructed from particle-forming protein which include HCV sequences are immunogenic with respect to HCV and
particle-forming protein. See, e.g., U.S. Pat. No. 4,722,840; EPO Pub. No. 175,261; EPO Pub. No. 299,149; Michelle et al.
(1984) Int. Symposium on Viral Hepatitis.

[0092] Vaccines may be prepared from one or more immunogenic polypeptides derived from HCV/J1 or HCV/J7. The
observed homology between HCV and Flaviviruses provides information concerning the polypeptides which are likely
to be most effective as vaccines, as well as the regions of the genome in which they are encoded. The general structure
of the Flavivirus genome is discussed in Rice et al. (1986) in THE VIRUSES: THE TOGAVIRIDAE AND FLAVIVIRIDAE
(Series eds. Fraenkel-Conrat and Wagner, Vol. eds. Schlesinger and Schlesinger, Plenum Press). The flavivirus
genomic RNA is believed to be the only virus-specific mRNA species, and it is translated into the three viral structural
proteins, i.e., C, M, and E, as well as two large nonstructural proteins, NV4 and NV5, and a complex set of smaller
nonstructural proteins. It is known that major neutralizing epitopes for Flaviviruses reside in the E (envelope) protein.
Roehrig (1986) in THE VIRUSES: THE TOGAVIRIDAE AND FLAVIVIRIDAE (Series eds. Fraenkel-Conrat and Wagner, Vol.
eds. Schlesinger and Schlesinger, Plenum Press). The corresponding HCV E gene and polypeptide encoding region
may be predicted, based upon the homology to Flaviviruses. Thus, vaccines may be comprised of recombinant
polypeptides containing epitopes of HCV E. These polypeptides may be expressed in bacteria, yeast or mammalian
cells, or alternatively may be isolated from viral preparations. It is also anticipated that the other structural proteins may
contain epitopes which give rise to protective anti-HCV antibodies. Thus, polypeptides containing the epitopes of
E, C, and M may also be used, whether singly or in combination, in HCV vaccines.

[0093] In addition to the above, it has been shown that immunization with NS1 (nonstructural protein 1) results in
protection against yellow fever. Schlesinger et al. (1986) J. Virol. 60:1153. This is true even though the immunization does
not give rise to neutralizing antibodies. Thus, particularly since this protein appears to be highly conserved among
Flaviviruses, it is likely that HCV NS1 will also be protective against HCV infection. Moreover, it also shows that
nonstructural proteins may provide protection against viral pathogenicity, even if they do not cause the production of neutralizing
antibodies.

[0094] In view of the above, multivalent vaccines against HCV may be comprised of one or more epitopes from one
or more structural proteins, and/or one or more epitopes from one or more nonstructural proteins. These vaccines may
be comprised of, for example, recombinant HCV polypeptides and/or polypeptides isolated from the virions. In
particular, vaccines are contemplated comprising one or more of the following HCV proteins, or subunit antigens derived
therefrom: E, NS1, C, NS2, NS3, NS4 and NS5. Particularly preferred are vaccines comprising E and/or NS1, or subunits
thereof. In addition, it may be possible to use inactivated HCV in vaccines; inactivation may be by the preparation of viral
lysates, or by other means known in the art to cause inactivation of Flaviviruses, for example, treatment with organic
solvents or detergents, or treatment with formalin. Moreover, vaccines may also be prepared from attenuated HCV
strains or from hybrid viruses such as vaccinia vectors known in the art [Brown et al., Nature 319: 549-550 (1986)].
[0095] The preparation of vaccines which contain immunogenic polypeptide(s) as active ingredients is known to one
skilled in the art. Typically, such vaccines are prepared as injectables, either as liquid solutions or suspensions; solid
forms suitable for solution in, or suspension in, liquid prior to injection may also be prepared. The preparation may also
be emulsified, or the protein encapsulated in liposomes. The active immunogenic ingredients are often mixed with
excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for
example, water, saline, dextrose, glycerol, ethanol, or the like and combinations thereof. In addition, if desired, the
vaccine may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents,
and/or adjuvants which enhance the effectiveness of the vaccine. Examples of adjuvants which may be effective include
but are not limited to: aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE),
and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate and
cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 emulsion. The effectiveness of an adjuvant may be
determined by measuring the amount of antibodies directed against an immunogenic polypeptide containing an HCV
antigenic sequence resulting from administration of this polypeptide in vaccines which are also comprised of the various
adjuvants.

[0096] The vaccines are conventionally administered parenterally, by injection, usually either subcutaneously or
intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories and, in
some cases, oral formulations. For suppositories, traditional binders and carriers may include, for example, polyalkylene
glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in the range
of 0.5% to 10%, preferably 1%-2%. Oral formulations include such normally employed excipients as, for example,
pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium
carbonate, and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release
formulations or powders and contain 10%-95% of active ingredient, preferably 25%-70%.

[0097] The proteins may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable salts
include the acid addition salts (formed with free amino groups of the peptide), which are formed with inorganic acids
such as, for example, hydrochloric or phosphoric acids, or with organic acids such as acetic, oxalic, tartaric, maleic,
and the like. Salts formed with the free carboxyl groups may also be derived from inorganic bases such as, for example,
sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine,
trimethylamine, 2-ethylamino ethanol, histidine, procaine, and the like.

[0098] The vaccines are administered in a manner compatible with the dosage formulation, and in such amount as
will be prophylactically and/or therapeutically effective. The quantity to be administered, which is generally in the range
of 5 micrograms to 250 micrograms of antigen per dose, depends on the subject to be treated, the capacity of the
subject's immune system to synthesize antibodies, and the degree of protection desired. Precise amounts of active
ingredient required to be administered may depend on the judgment of the practitioner and may be peculiar to each subject.
[0099] The vaccine may be given in a single dose schedule, or preferably in a multiple dose schedule. A multiple dose
schedule is one in which a primary course of vaccination may be with 1-10 separate doses, followed by other doses
given at subsequent time intervals required to maintain and/or reinforce the immune response, for example, at 1-4
months for a second dose, and if needed, a subsequent dose(s) after several months. The dosage regimen will also, at
least in part, be determined by the need of the individual and be dependent upon the judgment of the practitioner.
[0100] In addition, the vaccine containing the immunogenic HCV antigen(s) may be administered in conjunction with
other immunoregulatory agents, for example, immune globulins.

[0101] The immunogenic polypeptides prepared as described above are used to produce antibodies, both polyclonal
and monoclonal. If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, goat, horse, etc.) is
immunized with an immunogenic polypeptide bearing an HCV epitope(s). Serum from the immunized animal is
collected and treated according to known procedures. If serum containing polyclonal antibodies to an HCV epitope
contains antibodies to other antigens, the polyclonal antibodies can be purified by immunoaffinity chromatography.
Techniques for producing and processing polyclonal antisera are known in the art, see for example, Mayer and Walker,
eds. (1987) IMMUNOCHEMICAL METHODS IN CELL AND MOLECULAR BIOLOGY (Academic Press, London).
[0102] Monoclonal antibodies directed against HCV epitopes can also be readily produced by one skilled in the art.
The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal antibody-producing
cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes
with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., M. Schreier et al. (1980) HYBRIDOMA
TECHNIQUES; Hammerling et al. (1981), MONOCLONAL ANTIBODIES AND T-CELL HYBRIDOMAS; Kennett et al. (1980)
MONOCLONAL ANTIBODIES; see also, U.S. Patent Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,466,917;
4,472,500; 4,491,632; and 4,493,890. Panels of monoclonal antibodies produced against HCV epitopes can be
screened for various properties; i.e., for isotype, epitope affinity, etc.

[0103] Antibodies, both monoclonal and polyclonal, which are directed against HCV epitopes are particularly useful
in diagnosis, and those which are neutralizing are useful in passive immunotherapy. Monoclonal antibodies, in
particular, may be used to raise anti-idiotype antibodies.

[0104] Anti-idiotype antibodies are immunoglobulins which carry an "internal image" of the antigen of the infectious
agent against which protection is desired. Techniques for raising anti-idiotype antibodies are known in the art. See, e.g.,
Grzych (1985), Nature 316:74; MacNamara et al. (1984), Science 226:1325; Uytdehaag et al. (1985), J. Immunol.
134:1225. These anti-idiotype antibodies may also be useful for treatment and/or diagnosis of NANBH, as well as for
an elucidation of the immunogenic regions of HCV antigens.

[0105] Using the HCV/J1 or HCV/J7 polynucleotide sequences as a basis, oligomers of approximately 8 nucleotides
or more can be prepared, either by excision or synthetically, which hybridize with the HCV genome and are useful in
identification of the viral agent(s), further characterization of the viral genome(s), as well as in detection of the virus(es)
in diseased individuals. The probes for HCV polynucleotides (natural or derived) are a length which allows the detection
of unique viral sequences by hybridization. While 6-8 nucleotides may be a workable length, sequences of about 10-12
nucleotides are preferred, and about 20 nucleotides appears optimal. These probes can be prepared using routine
methods, including automated oligonucleotide synthetic methods. Among useful probes, for example, are the clones
disclosed herein, as well as the various oligomers useful in probing cDNA libraries, set forth below. A complement to
any unique portion of the HCV genome will be satisfactory. For use as probes, complete complementarity is desirable,
though it may be unnecessary as the length of the fragment is increased.
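The relationship between probe length and uniqueness can be illustrated with a short calculation. The following is a minimal sketch, not part of the disclosed method: it assumes a random target sequence with equal base frequencies, and the function names and the example sequence and lengths are hypothetical. It derives the complementary probe for a target stretch and estimates how often a probe of a given length would match a target of a given length purely by chance.

```python
# Illustrative sketch only: names, example sequence, and lengths are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would hybridize to `seq` (read 5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def expected_chance_matches(probe_len: int, target_len: int) -> float:
    """Expected number of exact chance matches of one probe on one strand.

    Assumes a random target with equal base frequencies, so each of the
    (target_len - probe_len + 1) positions matches with probability 4**-probe_len.
    """
    positions = max(target_len - probe_len + 1, 0)
    return positions / 4 ** probe_len


if __name__ == "__main__":
    print(reverse_complement("ATGCGT"))
    for n in (8, 12, 20):
        # 10,000 nt is an arbitrary illustrative target length.
        print(n, expected_chance_matches(n, 10_000))
```

Under these assumptions the expected number of chance matches falls by a factor of four for each added nucleotide, which is consistent with the preference stated above for longer probes.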

[0106] For use of such probes as diagnostics, the biological sample to be analyzed, such as blood or serum, may be
treated, if desired, to extract the nucleic acids contained therein. The resulting nucleic acid from the sample may be
subjected to gel electrophoresis or other size separation techniques; alternatively, the nucleic acid sample may be dot
blotted without size separation. The probes are then labeled. Suitable labels, and methods for labeling probes, are known
in the art, and include, for example, radioactive labels incorporated by nick translation or kinasing, biotin, fluorescent
probes, and chemiluminescent probes. The nucleic acids extracted from the sample are then treated with the labeled
probe under hybridization conditions of suitable stringencies. Usually high stringency conditions are desirable in order
to prevent false positives. The stringency of hybridization is determined by a number of factors during hybridization and
during the washing procedure, including temperature, ionic strength, length of time, and concentration of formamide.
These factors are outlined in, for example, Maniatis, T. (1982) MOLECULAR CLONING: A LABORATORY MANUAL
(Cold Spring Harbor Press, Cold Spring Harbor, N.Y.).

[0107] Generally, it is expected that the HCV genome sequences will be present in serum of infected individuals at
relatively low levels, i.e., at approximately 10^2-10^3 chimp infectious doses (CID) per ml. This level may require that
amplification techniques be used in hybridization assays. Such techniques are known in the art. For example, the Enzo
Biochemical Corporation "Bio-Bridge" system uses terminal deoxynucleotide transferase to add unmodified 3'-poly-dT
tails to a DNA probe. The poly-dT-tailed probe is hybridized to the target nucleotide sequence, and then to a
biotin-modified poly-A. PCT App. No. 84/03520 and EPO Pub. No. 124,221 describe a DNA hybridization assay in which:
(1) analyte is annealed to a single-stranded DNA probe that is complementary to an enzyme-labeled oligonucleotide; and
(2) the resulting tailed duplex is hybridized to an enzyme-labeled oligonucleotide. EPO Pub. No. 204,510 describes a DNA
hybridization assay in which analyte DNA is contacted with a probe that has a tail, such as a poly-dT tail, and an amplifier
strand that has a sequence that hybridizes to the tail of the probe, such as a poly-A sequence, and which is capable of
binding a plurality of labeled strands.

[0108] A particularly desirable technique may first involve amplification of the target HCV sequences in sera
approximately 10,000-fold, i.e., to approximately 10^6 sequences/ml. This may be accomplished, for example, by the
polymerase chain reaction (PCR) technique described by Saiki et al. (1986) Nature 324:163, Mullis, U.S. Patent No.
4,683,195, and Mullis et al. U.S. Patent No. 4,683,202. The amplified sequence(s) may then be detected using a
hybridization assay which is described in co-pending European Publication No. 317,077 and Japanese application No.
63-260347, which are assigned to the herein assignee, and are hereby incorporated herein by reference. These
hybridization assays, which should detect sequences at the level of 10^6/ml, utilize nucleic acid multimers which bind to
single-stranded analyte nucleic acid, and which also bind to a multiplicity of single-stranded labeled oligonucleotides. A
suitable solution phase sandwich assay which may be used with labeled polynucleotide probes, and the methods for the
preparation of probes, is described in EPO Pub. No. 225,807, which is hereby incorporated herein by reference.
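The arithmetic behind the amplification step can be sketched as follows. This is an illustrative calculation only: it assumes, as a simplification, that one chimp infectious dose corresponds to roughly one genome copy, and that each PCR cycle multiplies the target by a fixed factor (exactly 2 for a perfectly efficient cycle). The function names and the efficiency parameter are hypothetical.

```python
import math


def amplified_titer(start_per_ml: float, fold: float) -> float:
    """Sequences per ml after amplifying a starting titer by `fold`."""
    return start_per_ml * fold


def cycles_needed(fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of PCR cycles that reaches `fold` amplification.

    `efficiency` is the assumed fraction of templates copied per cycle,
    so each cycle multiplies the target by (1 + efficiency).
    """
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    # Starting levels of about 10^2 to 10^3 per ml, amplified 10,000-fold.
    for start in (1e2, 1e3):
        print(start, amplified_titer(start, 1e4))
    print(cycles_needed(1e4))        # idealized doubling every cycle
    print(cycles_needed(1e4, 0.8))   # a less efficient, assumed reaction
```

With a starting level of about 10^2 per ml, a 10,000-fold amplification gives the approximately 10^6 sequences/ml cited above; under idealized doubling this corresponds to 14 cycles.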
[0109] The probes can be packaged into diagnostic kits. Diagnostic kits include the probe DNA, which may be labeled;
alternatively, the probe DNA may be unlabeled and the ingredients for labeling may be included in the kit in separate
containers. The kit may also contain other suitably packaged reagents and materials needed for the particular
hybridization protocol, for example, standards, wash buffers, as well as instructions for conducting the test.
[0110] Both the HCV/J1 or HCV/J7 polypeptides which react immunologically with serum containing HCV antibodies
and the antibodies raised against the HCV specific epitopes in these polypeptides are useful in immunoassays to detect
presence of HCV antibodies, or the presence of the virus and/or viral antigens, in biological samples. Design of the
immunoassays is subject to a great deal of variation, and a variety of these are known in the art. An immunoassay for
anti-HCV antibody may utilize one viral epitope or several viral epitopes. When multiple epitopes are used, the epitopes
may be derived from the same or different viral polypeptides, and may be in separate recombinant or natural
polypeptides, or together in the same recombinant polypeptides.

[0111] An immunoassay for viral antigen may use, for example, a monoclonal antibody directed towards a viral 
epitope, a combination of monoclonal antibodies directed towards epitopes of one viral polypeptide, monoclonal anti- 
bodies directed towards epitopes of different viral polypeptides, polyclonal antibodies directed towards the same viral 
antigen, polyclonal antibodies directed towards different viral antigens or a combination of monoclonal and polyclonal 
antibodies. 

[0112] Immunoassay protocols may be based, for example, upon competition, or direct reaction, or sandwich type
assays. Protocols may also, for example, use solid supports, or may be by immunoprecipitation. Most assays involve
the use of labeled antibody or polypeptide. The labels may be, for example, fluorescent, chemiluminescent, radioactive,
or dye molecules. Assays which amplify the signals from the probe are also known, examples of which are assays
which utilize biotin and avidin, and enzyme-labeled and mediated immunoassays, such as ELISA assays.
[0113] Typically, an immunoassay for anti-HCV antibody will involve selecting and preparing the test sample, such as
a biological sample, and then incubating it with an antigenic (i.e., epitope-containing) HCV polypeptide under conditions
that allow antigen-antibody complexes to form. Such conditions are well known in the art. In a heterogeneous format,
the polypeptide is bound to a solid support to facilitate separation of the sample from the polypeptide after incubation.
Examples of solid supports that can be used are nitrocellulose, in membrane or microtiter well form, polyvinylchloride,
in sheets or microtiter wells, polystyrene latex in beads or microtiter plates, polyvinylidine fluoride, known as
Immobulon™, diazotized paper, nylon membranes, activated beads, and Protein A beads. Most preferably, the Dynatech
Immulon™ 1 microtiter plate or the 0.25-inch polystyrene beads, which are Spec finished by Precision Plastic Ball, are
used in the heterogeneous format. The solid support is typically washed after separating it from the test sample. In a
homogeneous format, the test sample is incubated with antigen in solution, under conditions that will precipitate any
antigen-antibody complexes that are formed, as is known in the art. The precipitated complexes are then separated from
the test sample, for example, by centrifugation. The complexes formed comprising anti-HCV antibody are then detected
by any of a number of techniques. Depending on the format, the complexes can be detected with labeled anti-xenogeneic
Ig or, if a competitive format is used, by measuring the amount of bound, labeled competing antibody.
[0114] In immunoassays where HCV polypeptides are the analyte, the test sample, typically a biological sample, is
incubated with anti-HCV antibodies, again under conditions that allow the formation of antigen-antibody complexes.
Various formats can be employed, such as a "sandwich" assay where antibody bound to a solid support is incubated with
the test sample; washed; incubated with a second, labeled antibody to the analyte; and the support is washed again.
Analyte is detected by determining if the second antibody is bound to the support. In a competitive format, which can
be either heterogeneous or homogeneous, a test sample is usually incubated with an antibody and a labeled, competing
antigen, either sequentially or simultaneously. These and other formats are well known in the art.
[0115] The Flavivirus model for HCV allows predictions regarding the likely location of diagnostic epitopes for the
virion structural proteins. The C, pre-M, M, and E domains are all likely to contain epitopes of significant potential for
detecting viral antigens, and particularly for diagnosis. Similarly, domains of the nonstructural proteins are expected to
contain important diagnostic epitopes (e.g., NS5 encoding a putative polymerase; and NS1 encoding a putative
complement-binding antigen). Recombinant polypeptides, or viral polypeptides, which include epitopes from these specific
domains may be useful for the detection of viral antibodies in infectious blood donors and infected patients. In addition,
antibodies directed against the E and/or M proteins can be used in immunoassays for the detection of viral antigens in
patients with HCV caused NANBH, and in infectious blood donors. Moreover, these antibodies may be extremely useful
in detecting acute-phase donors and patients.

[0116] Antigenic regions of the putative polyprotein can be mapped and identified by screening the antigenicity of
bacterial expression products of HCV cDNAs which encode portions of the polyprotein. Other antigenic regions of HCV
may be detected by expressing the portions of the HCV cDNAs in other expression systems, including yeast systems
and cellular systems derived from insects and vertebrates. In addition, studies giving rise to an antigenicity index and
hydrophobicity/hydrophilicity profile provide information concerning the probability of a region's antigenicity. Efficient
detection systems may include the use of panels of epitopes. The epitopes in the panel may be constructed into one or
multiple polypeptides.

[0117] Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are constructed by
packaging the appropriate materials, including the polypeptides of the invention containing HCV epitopes or antibodies
directed against HCV epitopes in suitable containers, along with the remaining reagents and materials required for the
conduct of the assay (e.g., wash buffers, detection means like labeled anti-human Ig, labeled anti-HCV, or labeled HCV
antigen), as well as a suitable set of assay instructions.

[0118] The HCV/J1 and HCV/J7 nucleotide sequence information described herein may be used to gain further
information on the sequence of the HCV genomes, and for identification and isolation of additional HCV isolates related to
J1 or J7. This information, in turn, can lead to additional polynucleotide probes, polypeptides derived from the HCV
genome, and antibodies directed against HCV epitopes which would be useful for the diagnosis and/or treatment of
HCV caused NANBH.

[0119] The HCV/J1 and HCV/J7 nucleotide sequence information herein is useful for the design of probes for the
isolation of additional sequences which are derived from as yet undefined regions of the HCV genomes from which the J1
and J7 sequences are derived. For example, labeled probes containing a sequence of approximately 8 or more
nucleotides, and preferably 20 or more nucleotides, which are derived from regions close to the 5'-termini or 3'-termini of the
family of HCV cDNA sequences disclosed in the examples may be used to isolate overlapping cDNA sequences from
HCV cDNA libraries. These sequences which overlap the cDNAs in the above-mentioned clones, but which also contain
sequences derived from regions of the genome from which the cDNA in the above-mentioned clones are not derived,
may then be used to synthesize probes for identification of other overlapping fragments which do not necessarily
overlap the cDNAs described below. Methods for constructing cDNA libraries are known in the art. See, e.g., EPO Pub. No.
318,216. It is particularly preferred to prepare libraries from the serum of Japanese and other Asian patients diagnosed
as having NANBH demonstrating antibody to HCV1 antigens; these are believed to be the most likely candidates for
carriers of HCV/J1, HCV/J7, or related isolates.

[0120] HCV particles may be isolated from the sera from individuals with NANBH or from cell cultures by any of the 
methods known in the art, including for example, techniques based on size discrimination such as sedimentation or
exclusion methods, or techniques based on density such as ultracentrifugation in density gradients, or precipitation with 
agents such as polyethylene glycol, or chromatography on a variety of materials such as anionic or cationic exchange 
materials, and materials which bind due to hydrophobicity. 

[0121] A preferred method of isolating HCV particles or antigen is by immunoaffinity columns. Techniques for
immunoaffinity chromatography are known in the art, including techniques for affixing antibodies to solid supports so that
they retain their immunoselective activity. The techniques may be those in which the antibodies are adsorbed to the support
(see, for example, Kurstak in ENZYME IMMUNODIAGNOSIS, page 31-37), as well as those in which the antibodies are
covalently linked to the support. Generally, the techniques are similar to those used for covalent linking of antigens to a
solid support described above. However, spacer groups may be included in the bifunctional coupling agents so that the
antigen binding site of the antibody remains accessible. The antibodies may be monoclonal, or polyclonal, and it may
be desirable to purify the antibodies before their use in the immunoassay.

[0122] The general techniques used in extracting the genome from a virus, preparing and probing a cDNA library,
sequencing clones, constructing expression vectors, transforming cells, performing immunological assays such as
radioimmunoassays and ELISA assays, growing cells in culture, and the like are known in the art, and laboratory
manuals are available describing these techniques. However, as a general guide, the following sets forth some sources
currently available for such procedures, and for materials useful in carrying them out.

[0123] Both prokaryotic and eukaryotic host cells may be used for expression of desired coding sequences when
appropriate control sequences which are compatible with the designated host are used. Among prokaryotic hosts,
E. coli is most frequently used. Expression control sequences for prokaryotes include promoters, optionally containing
operator portions, and ribosome binding sites. Transfer vectors compatible with prokaryotic hosts are commonly derived
from, for example, pBR322, a plasmid containing operons conferring ampicillin and tetracycline resistance, and the
various pUC vectors, which also contain sequences conferring antibiotic resistance markers. These markers may be used
to obtain successful transformants by selection. Commonly used prokaryotic control sequences include the
beta-lactamase (penicillinase) and lactose promoter systems (Chang et al. (1977), Nature 198:1056), the tryptophan (trp)
promoter system (Goeddel et al. (1980) Nucleic Acids Res. 8:4057), the lambda-derived P L promoter and N gene
ribosome binding site (Shimatake et al. (1981) Nature 292:128), and the hybrid tac promoter (De Boer et al. (1983) Proc.
Natl. Acad. Sci. USA 222:128) derived from sequences of the trp and lac UV5 promoters. The foregoing systems are
particularly compatible with E. coli; if desired, other prokaryotic hosts such as strains of Bacillus or Pseudomonas may
be used, with corresponding control sequences.

[0124] Eukaryotic hosts include yeast and mammalian cells in culture systems. Saccharomyces cerevisiae,
Saccharomyces carlsbergensis, Kluyveromyces lactis and Pichia pastoris are the most commonly used yeast hosts, and are
convenient fungal hosts. Yeast compatible vectors carry markers which permit selection of successful transformants by
conferring prototrophy to auxotrophic mutants or resistance to heavy metals on wild-type strains. Yeast compatible
vectors may employ the 2 micron origin of replication (Broach et al. (1983) Meth. Enz. 101:307), the combination of CEN3
and ARS1, or other means for assuring replication, such as sequences which will result in incorporation of an
appropriate fragment into the host cell genome. Control sequences for yeast vectors are known in the art and include promoters
for the synthesis of glycolytic enzymes (Hess et al. (1968) J. Adv. Enzyme Reg. 7:149; Holland et al. (1978), J. Biol.
Chem. 256:1385), including the promoter for 3-phosphoglycerate kinase (Hitzeman (1980), J. Biol. Chem. 255:2073).
Terminators may also be included, such as those derived from the enolase gene (Holland (1981), J. Biol. Chem.
256:1385). Particularly useful control systems are those which comprise the glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) promoter or alcohol dehydrogenase (ADH) regulatable promoter, terminators also derived from
GAPDH, and if secretion is desired, leader sequence from yeast alpha factor. In addition, the transcriptional regulatory
region and the transcriptional initiation region which are operably linked may be such that they are not naturally
associated in the wild-type organism. These systems are described in detail in EPO Pub. No. 120,551; EPO Pub. No. 116,201;
and EPO Pub. No. 164,556, all of which are incorporated herein by reference.

[0125] Mammalian cell lines available as hosts for expression are known in the art and include many immortalized
cell lines available from the American Type Culture Collection (ATCC), including HeLa cells, Chinese hamster ovary
(CHO) cells, baby hamster kidney (BHK) cells, and a number of other cell lines. Suitable promoters for mammalian cells
are also known in the art and include viral promoters such as that from Simian Virus 40 (SV40) (Fiers (1978), Nature
273:113), Rous sarcoma virus (RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Mammalian cells may also
require terminator sequences and poly A addition sequences; enhancer sequences which increase expression may
also be included, and sequences which cause amplification of the gene may also be desirable. These sequences are
known in the art. Vectors suitable for replication in mammalian cells may include viral replicons, or sequences which
insure integration of the appropriate sequences encoding NANBV epitopes into the host genome.
[0126] The vaccinia virus system can also be used to express foreign DNA in mammalian cells. To express
heterologous genes, the foreign DNA is usually inserted into the thymidine kinase gene of the vaccinia virus and then infected
cells can be selected. This procedure is known in the art and further information can be found in these references
[Mackett et al. J. Virol. 49: 857-864 (1984) and Chapter 7 in DNA Cloning, Vol. 2, IRL Press].
[0127] In addition, viral antigens can be expressed in insect cells by the Baculovirus system. A general guide to
baculovirus expression by Summers and Smith is A Manual of Methods for Baculovirus Vectors and Insect Cell Culture
Procedures (Texas Agricultural Experiment Station Bulletin No. 1555). To incorporate the heterologous gene into the
Baculovirus genome, the gene is first cloned into a transfer vector containing some Baculovirus sequences. This
transfer vector, when it is cotransfected with wild-type virus into insect cells, will recombine with the wild-type virus. Usually,
the transfer vector will be engineered so that the heterologous gene will disrupt the wild-type Baculovirus polyhedrin
gene. This disruption enables easy selection of the recombinant virus since the cells infected with the recombinant virus
will appear phenotypically different from the cells infected with the wild-type virus. The purified recombinant virus can
be used to infect cells to express the heterologous gene. The foreign protein can be secreted into the medium if a signal
peptide is linked in frame to the heterologous gene; otherwise, the protein will be bound in the cell lysates. For further
information, see Smith et al. Mol. & Cell. Biol. 3:2156-2165 (1983) or Luckow and Summers in Virology 170:31-39 (1989).
[0128] Transformation may be by any known method for introducing polynucleotides into a host cell, including, for
example, packaging the polynucleotide in a virus and transducing a host cell with the virus, and by direct uptake of the
polynucleotide. The transformation procedure used depends upon the host to be transformed. Bacterial transformation
by direct uptake generally employs treatment with calcium or rubidium chloride (Cohen (1972), Proc. Natl. Acad. Sci.
USA 69:2110; Maniatis et al. (1982), MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor Press,
Cold Spring Harbor, N.Y.)). Yeast transformation by direct uptake may be carried out using the method of Hinnen et al.
(1978) Proc. Natl. Acad. Sci. USA 75:1929. Mammalian transformations by direct uptake may be conducted using the
calcium phosphate precipitation method of Graham and Van der Eb (1978), Virology 52:546, or the various known
modifications thereof.

[0129] Vector construction employs techniques which are known in the art. Site-specific DNA cleavage is performed
by treating with suitable restriction enzymes under conditions which generally are specified by the manufacturer of
these commercially available enzymes. The cleaved fragments may be separated using polyacrylamide or agarose gel
electrophoresis techniques, according to the general procedures found in Methods in Enzymology (1980) 65:499-560.
Sticky ended cleavage fragments may be blunt ended using E. coli DNA polymerase I (Klenow) in the presence of the
appropriate deoxynucleotide triphosphates (dNTPs). Treatment with S1 nuclease may also be
used, resulting in the hydrolysis of any single stranded DNA portions.
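Site-specific cleavage can be modeled in a few lines for planning purposes. The sketch below is illustrative and rests on several assumptions not taken from the text: it treats the DNA as a linear molecule, considers only the top strand, and uses the EcoRI recognition site (GAATTC, cut after the first G) as an example enzyme. The function name and parameters are hypothetical.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Split a linear top-strand sequence at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the top
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


if __name__ == "__main__":
    for fragment in digest("AAGAATTCTTGAATTCGG"):
        print(len(fragment), fragment)
```

The fragment lengths printed by such a helper are the sizes one would expect to resolve by the gel electrophoresis step described above; the staggered cut on the opposite strand, which produces the sticky ends, is not modeled here.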

[0130] Ligations are carried out using standard buffer and temperature conditions using T4 DNA ligase and ATP;
sticky end ligations require less ATP and less ligase than blunt end ligations. When vector fragments are used as part
of a ligation mixture, the vector fragment is often treated with bacterial alkaline phosphatase (BAP) or calf intestinal
alkaline phosphatase to remove the 5'-phosphate and thus prevent religation of the vector; alternatively, restriction
enzyme digestion of unwanted fragments can be used to prevent ligation. Ligation mixtures are transformed into
suitable cloning hosts, such as E. coli, and successful transformants selected by, for example, antibiotic resistance, and
screened for the correct construction.

[0131] Synthetic oligonucleotides may be prepared using an automated oligonucleotide synthesizer as described by
Warner (1984), DNA 3:401. If desired, the synthetic strands may be labeled with 32P by treatment with polynucleotide
kinase in the presence of 32P-ATP, using standard conditions for the reaction. DNA sequences, including those isolated
from cDNA libraries, may be modified by known techniques, including, for example, site directed mutagenesis, as
described by Zoller (1982), Nucleic Acids Res. 10:6487.

[0132] DNA libraries may be probed using the procedure of Grunstein and Hogness (1975), Proc. Natl. Acad. Sci.
USA 72:3961. Briefly, in this procedure, the DNA to be probed is immobilized on nitrocellulose filters, denatured, and
prehybridized with a buffer. The percentage of formamide in the buffer, as well as the time and temperature conditions
of the prehybridization and subsequent hybridization steps, depends on the stringency required. Oligomeric probes
which require lower stringency conditions are generally used with low percentages of formamide, lower temperatures,
and longer hybridization times. Probes containing more than 30 or 40 nucleotides, such as those derived from cDNA or
genomic sequences, generally employ higher temperatures, e.g., about 40-42°C, and a high percentage, e.g., 50%,
formamide. Following prehybridization, 32P-labeled oligonucleotide probe is added to the buffer, and the filters are
incubated in this mixture under hybridization conditions. After washing, the treated filters are subjected to autoradiography
to show the location of the hybridized probe; DNA in corresponding locations on the original agar plates is used as the
source of the desired DNA.

[0133] An enzyme-linked immunosorbent assay (ELISA) can be used to measure either antigen or antibody
concentrations. This method depends upon conjugation of an enzyme to either an antigen or an antibody, and uses the bound
enzyme activity as a quantitative label. To measure antibody, the known antigen is fixed to a solid phase (e.g., a
microplate or plastic cup), incubated with test serum dilutions, washed, incubated with anti-immunoglobulin labeled with an
enzyme, and washed again. Enzymes suitable for labeling are known in the art, and include, for example, horseradish
peroxidase. Enzyme activity bound to the solid phase is measured by adding the specific substrate, and determining
product formation or substrate utilization colorimetrically. The enzyme activity bound is a direct function of the amount
of antibody bound.

[0134] To measure antigen, a known specific antibody is fixed to the solid phase, the test material containing antigen
is added, the solid phase is washed after an incubation, and a second enzyme-labeled antibody is added. After washing,
substrate is added, and enzyme activity is estimated colorimetrically and related to antigen concentration.
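
Relating enzyme activity to antigen concentration is commonly done against a standard curve. The following is a minimal sketch, assuming a monotonically increasing curve and piecewise-linear interpolation between standards; the function name and the standard values are hypothetical and are not data from this disclosure.

```python
from bisect import bisect_left


def concentration_from_absorbance(absorbance, standards):
    """Interpolate a concentration from an absorbance reading.

    standards: list of (concentration, absorbance) pairs sorted by
    strictly increasing absorbance.
    """
    concs = [c for c, _ in standards]
    absorbs = [a for _, a in standards]
    if not absorbs[0] <= absorbance <= absorbs[-1]:
        raise ValueError("absorbance outside the range of the standard curve")
    i = bisect_left(absorbs, absorbance)
    if absorbs[i] == absorbance:
        return concs[i]
    # Linear interpolation between the two bracketing standards.
    a0, a1 = absorbs[i - 1], absorbs[i]
    c0, c1 = concs[i - 1], concs[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical standards: (antigen concentration in arbitrary units, absorbance).
STANDARDS = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]

estimate = concentration_from_absorbance(0.35, STANDARDS)
print(f"Estimated concentration: {estimate:.1f} units")
```

Readings outside the range of the standards are rejected rather than extrapolated, since the colorimetric response is not guaranteed to stay linear there.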

Examples

I

[0135] This example describes the cloning of the HCV/J1 and HCV/J7 nucleotide sequences.

[0136] Both blood samples which were used as a source of HCV virions were found to be positive in an anti-HCV
antibody assay. The HCV isolates from these samples were named HCV/J1 and HCV/J7. The infectivity of the blood
sample containing the J1 isolate was confirmed by a prospective study of blood transfusion recipients. Dr. Tohru
Katayama from the Department of Surgery at the National Tokyo Chest Hospital collected blood from patients who had
contracted post-transfusion non-A, non-B hepatitis. He also collected blood samples from the respective blood donors
of these patients. These samples were then assayed for antibodies to the C100-3 HCV1 antigen (EPO Pub. No.
318,216), and blood from one of the donors was found to be positive.

[0137] Isolation of the RNA from the blood samples began by pelleting virions in the blood sample by ultracentrifugation
[Bradley, D.W., McCaustland, K.A., Cook, E.H., Schable, C.A., Ebert, J.W. and Maynard, J.E. (1985) Gastroenterology
88, 773-779]. RNA was then extracted from the pellet by the guanidinium/cesium chloride method [Maniatis, T.,
Fritsch, E.F., and Sambrook, J. (1982) "Molecular Cloning: A Laboratory Manual", Cold Spring Harbor Laboratory, Cold
Spring Harbor] and further purified by phenol/chloroform extraction in the presence of urea [Berk, A.J., Lee, F., Harrison,
T., Williams, J. and Sharp, P.A. (1979) Cell 17, 935-944].

[0138] Five pairs of synthetic oligonucleotide primers were designed from the C/E, E, E/NS1, NS3, and NS5 domains
of the nucleotide sequence of HCV1 to isolate fragments from the J1 and J7 genomes. The first set of primers was to
isolate the sequence from the core and some of the envelope domain. The second set of primers was to isolate the
sequences in the envelope domain. The third set of primers was to isolate a fragment which overlapped the putative
envelope and non-structural one (NS1) domains. The fourth and fifth sets of primers were used to isolate fragments from
non-structural domains three and five, NS3 and NS5. The sequences for the various primers are shown below:
The sequences of the primers for the C/E region were:

21S    5' CGTGCCCCCGCAAGACTGCT 3'

J80A   5' CCGTCCTCCAGAACCCGGAC 3'

The sequences of the primers for the E region were:

71S    5' GCCGACCTCATGGGGTACAT 3'

J132A  5' AACTGCGACACCACTAAGGC 3'

The sequences of the primers for the E/NS1 region were:

127S   5' TGGCATGGGATATGATGATG 3'

166A   5' TTGAACTTGTGGTGATAGAA 3'

The sequences of the primers for the NS3 region were:

464S   5' GGCTATACCGGCGACTTCGA 3'

526A   5' GACATGCATGTCATGATGTA 3'

The sequences of the primers for the NS5 region were:

870S   5' GCTGGAAAGAGGGTCTACTA 3'

917A   5' GTTCTTACTGCCCAGTTGAA 3'
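
Basic properties of primers such as those listed above (length, GC content, and the sequence of the opposite strand) can be checked with a few lines of code. This is an illustrative sketch only; the helper names are hypothetical, and the four sequences are copied from the list above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences copied from the list above (5' to 3').
PRIMERS = {
    "21S": "CGTGCCCCCGCAAGACTGCT",
    "J80A": "CCGTCCTCCAGAACCCGGAC",
    "464S": "GGCTATACCGGCGACTTCGA",
    "526A": "GACATGCATGTCATGATGTA",
}

for name, seq in PRIMERS.items():
    print(name, len(seq), f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
```

The reverse complement of an antisense primer gives the sense-strand sequence it anneals to, which is convenient when locating a primer on a genome written in the sense orientation.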



[0139] 1 µg of the antisense primer, 166A, 526A, or 917A, was added to 10 units of reverse transcriptase (Biorad)
to synthesize cDNA fragments from the isolated RNA as the template. The cDNA fragments were then amplified by a
standard polymerase chain reaction [Saiki, R.K., Scharf, S., Faloona, F., Mullis, K.B., Horn, G.T., Erlich, H.A., and
Arnheim, N. (1985) Science 230, 1350-1354] after 1 µg of the appropriate sense primer, 21S, 71S, 127S, 464S or 870S,
was added.

[0140] The cDNA fragments amplified by the PCR method were gel isolated and cloned by blunt-end ligation into the
SmaI site of pUC119 [Vieira, J. and Messing, J. (1987) Methods in Enzymology 153, 3-11] or into the SnaBI site of
charomid SB, a derivative of the cloning vector charomid 9-42 [Saito, I. and Stark, G. (1986) Proc. Natl. Acad. Sci. USA
8664-8668]. Clones which contain the fragments of the five viral domains were successfully constructed.

II

[0141] From the PCR reaction of the Japanese isolates, J1 and J7, three independent clones from each region, C/E,
E, E/NS1, NS3, and NS5, have been sequenced by the dideoxy chain termination method.
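
One simple way to reconcile several independent clone sequences of the same region is a per-position majority vote. The sketch below is an illustrative assumption, not necessarily the procedure used here; the function name and the short sequences are hypothetical, and the clones are assumed to be already aligned and of equal length.

```python
from collections import Counter


def majority_consensus(clones, ambiguous="N"):
    """Return the per-position majority base of equal-length, aligned sequences.

    Positions without a strict majority are reported as `ambiguous`.
    """
    if len({len(clone) for clone in clones}) != 1:
        raise ValueError("clone sequences must be aligned and of equal length")
    consensus = []
    for column in zip(*clones):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count > len(clones) / 2 else ambiguous)
    return "".join(consensus)


# Hypothetical example: the third clone differs at one position.
clones = ["ACGTAC", "ACGTAC", "ACTTAC"]
print(majority_consensus(clones))
```

With three clones, a difference seen in only one clone is outvoted, which is useful if isolated differences are suspected to be amplification artifacts.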

[0142] Sequence from all regions except C/E has been isolated from the J1 isolate. Sequence from only the C/E
region has been isolated from the J7 isolate. Surprisingly, fragments isolated from both isolates are neither longer nor
shorter than what would be predicted from the HCV1 genome. However, there is heterogeneity between clones containing
sequence from the same region. Consequently, a consensus sequence was constructed for each of the
domains, C/E, E, E/NS1, NS3 and NS5, as shown respectively in Figures 1 through 5. These differences may be
explained as artifacts which occur randomly during the PCR amplification [Saiki, R.K., Scharf, S., Faloona, F., Mullis,
K.B., Horn, G.T., Erlich, H.A. and Arnheim, N. (1985) Science 230, 1350-1354]. Another explanation is that more than
one virus genome is present in the plasma of a single healthy carrier and that these genomes are heterogeneous at the
nucleotide level.

[0143] To clarify this point, it was determined how many of these nucleotide differences would lead to amino acid
changes, using the sequence from the NS3 domain of the J1 isolate as an example. Out of the five nucleotide differences,
three fall on the third position of the amino acid codon and do not change the amino acid sequence. Both of the
remaining two nucleotide changes fall on the first position of the amino acid codon and generate amino acid changes
of threonine to alanine and proline to alanine, all of which are small, neutral amino acid residues. Similarly, when analyzing
the nucleotide differences in other domains, many silent and conserved mutations are found. These results suggest
that nucleotide sequences of the HCV genomes in the plasma of a single healthy donor are heterogeneous at the
nucleotide level.
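
The kind of analysis described above, sorting nucleotide differences by codon position and by whether they change the encoded amino acid, can be sketched as follows. The function name and the two nine-nucleotide sequences are hypothetical and chosen only to reproduce the types of change mentioned (threonine to alanine, proline to alanine, and a silent third-position change); they are not taken from the J1 sequence. The standard genetic code is assumed.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (first, second, third base).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_differences(reference, variant):
    """List each mismatch as (nucleotide position, codon position, ref aa, var aa, kind).

    Both sequences must be in frame, of equal length, and a multiple of three long.
    """
    if len(reference) != len(variant) or len(reference) % 3:
        raise ValueError("sequences must be in-frame and of equal length")
    results = []
    for i, (ref_base, var_base) in enumerate(zip(reference, variant)):
        if ref_base == var_base:
            continue
        start = i - i % 3
        ref_aa = CODON_TABLE[reference[start:start + 3]]
        var_aa = CODON_TABLE[variant[start:start + 3]]
        kind = "silent" if ref_aa == var_aa else "replacement"
        results.append((i + 1, i % 3 + 1, ref_aa, var_aa, kind))
    return results


# Hypothetical sequences: Thr-Pro-Gly versus Ala-Ala-Gly.
differences = classify_differences("ACTCCGGGA", "GCTGCGGGC")
for row in differences:
    print(row)
```

First-position changes alter the amino acid in this example, while the third-position change is silent, mirroring the pattern reported for the NS3 fragment.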

[0144] In addition, once the consensus sequences for each of the fragments were compiled, each sequence was compared
to the HCV1 isolate in Figures 6 through 10. In Figure 6 the fragment from the C/E region of the J7 isolate shows
a 92.8% (512/552) nucleotide and 97.4% (150/154) amino acid homology to the HCV1 isolate. The fragment from the E
domain of J1 shows a lower nucleotide and amino acid homology to HCV1 in Figure 7, of 76.2% and 82.9%,
respectively. The fragment from the J1 isolate which overlaps the envelope and non-structural one domains shows the
lowest homology to HCV1, as seen in Figure 8, where the J1 isolate has a 71.5% nucleotide homology and a 73.5%
amino acid homology to HCV1. Figure 9 shows a comparison of the fragment from the NS3 domain of J1 to HCV1. The
homology between the nucleotide sequences is 79.8%, while the amino acid homology between the isolates is quite
high, 92.2% or 179/194 amino acids. Figure 10 shows the homology between the NS5 sequences from J1 and HCV1.
The sequences have an 84.3% nucleotide and 88.7% amino acid homology.
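
The homology percentages quoted above are simple ratios of matching positions to compared positions. A minimal sketch of the arithmetic follows; the function names are hypothetical, and only the match counts stated in the text are used. Note that 179/194 evaluates to about 92.27%, so the quoted 92.2% corresponds to truncation rather than rounding.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def percent(matches, total):
    """Percentage from a match count, rounded to one decimal place."""
    return round(100.0 * matches / total, 1)


# Match counts quoted in the text for the C/E fragment of J7.
print(percent(512, 552))   # nucleotide level
print(percent(150, 154))   # amino acid level
# Unrounded value for the NS3 amino acid comparison.
print(100.0 * 179 / 194)
```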

[0145] The vectors described in the examples above were deposited with the Patent Microorganism Depository,
Fermentation Institute, Agency of Industrial Science and Technology at 1-3, Higashi 1-chome, Tsukuba-shi, Ibaraki-ken 305,
Japan, and will be maintained under the provisions of the Budapest Treaty. The accession numbers and dates of the
deposit are listed below, on page 68.

III

[0146] An HCV/J1 clone, J1-1519, was isolated using essentially the techniques described above. However, the
primers used in the isolation were J159S and 199A. The sequences of the oligomeric primers J159S and 199A, which
follow, were based upon those in J1-1216 and in HCV1.

J159S  5' ACT GCC CTG AAC TGC AAT GA 3'

199A   5' AAT CCA GTT GAG TTC ATC CA 3'

[0147] Clone J1-1519 is comprised of an HCV cDNA sequence of 367 nucleotides which spans most of the 5'-half of
the NS1 region and which overlaps the E-region clone, J1-1216, by 31 nucleotides. Three independent clones spanning
this region were sequenced; the sequences in this region obtained from the three clones were identical. The sequence
of the HCV cDNA in J1-1216 (shown in the figure as J1) and the amino acids encoded therein (shown above the
nucleotide sequence) are shown in Figure 13. Figure 13 also shows the sequence differences between J1-1216 and the
comparable region of the prototype HCV1 cDNA (indicated in the figure as PT), and the resulting changes in the encoded
amino acids. The homology between the J1-1216 and HCV1 cDNA is approximately 70% at the nucleotide level, and
about 75% at the amino acid level.

[0148] A composite of the sequences from the putative core to NS1 region of the J1 isolate is shown in Figure 14;
also shown in the figure are the amino acids encoded in the J1 sequence. The variation from the HCV1 prototype
sequence is shown in the line below the J1 nucleotide sequence; the dashed lines indicate homologous sequences.
The nonhomologous amino acid encoded in the HCV1 prototype sequence is shown below the HCV1 nucleotide
sequence.

[0149] Cloned material containing the J1/1519 HCV cDNA (pS1-1519) has been maintained in DH5α, and deposited
with the Patent Microorganism Depository.

IV

[0150] Several regions of the J1 isolate, including the C200-C100 region from the putative NS3-NS4 region (which
encompasses the region encoding the 5-1-1 polypeptide in HCV1; see EPO Pub. No. 318,216) and the putative NS1-E
region, were amplified using the PCR method. The C200-C100 region includes nucleotides 3799 to 5321 of the
prototype HCV1. RNA was extracted as described above, except that extraction was with guanidinium thiocyanate in the
presence of Proteinase K and sodium dodecyl sulfate (SDS) (Maniatis (1982), supra). The RNA was transcribed into
HCV cDNA by incubation in a 25 µl reaction comprised of 1 µM of each primer, 40 units of RNase inhibitor (RNASIN),
5 units of AMV reverse transcriptase, and salts and buffer necessary for the reaction. Amplification of a segment of the
HCV cDNA from the designated region was performed utilizing pairs of synthetic oligomer 16-mer primers. PCR
amplification was accomplished in three rounds (PCR I, PCR II, and PCR III). The second and third rounds of PCR
amplification (PCR II and PCR III) utilized different sets of PCR primers; the first PCR reaction was diluted 10-fold and
multiple rounds of PCR amplification were carried out with the new primers, so that ultimately up to 50% of the products
of the first PCR reaction (PCR I) were reamplified. The primers used for the amplification of the regions were the
following. These primers, with the exception of J1C200-3, which was derived from the J1 isolate sequence, were derived
from the prototype HCV1 sequence.

Primers for amplification of the "5-1-1" region from NS3-NS4.

PCR I

[0151]

511/16A (sense, derived from nucleotides starting at number 1528 of HCV1)
5' AAC AGG CTG CGT GCT C 3'

511/16B (anti-sense, derived from nucleotides ending at 5260 of HCV1)
5' ACT TGG TCT GGA CAG C 3'

PCR II

[0152]

511/35A (sense, the HCV portion derived from nucleotides starting at number 5057 of HCV1; the restriction
enzyme site is underlined)
5' CTTGAATTC TCG TCT TGT CCG GGA AGC CGG CAA TC 3'

511/35B (anti-sense, the HCV portion derived from nucleotides ending at number 5233 of HCV1; the restriction
enzyme site is underlined)
5' CTTGAATTC CCT CTG CCT GAC GGG ACG CGG TCT GC 3'

PCR III

[0153]

511/35A (see supra)

VSNrc7 (antisense, derived from nucleotides ending at number 5804 of HCV1)
5' GTA GTG CCT GGG GGA AAC AT 3'

Primers for amplification of the "NS1/E" region

PCR I

[0154]

J1(52)3 (sense, the HCV portion derived from nucleotides starting at number 953 of HCV1; the restriction enzyme
site is underlined)
5' CTTAGAATTC TGG CAT GGG ATA TGA TGA TG 3'

(sense, the HCV portion derived from nucleotides starting at number 1087 of HCV1; the restriction enzyme
site is underlined)
5' CTTAGAATTC TCC ATG GTG GGG AAC TGG GC 3'

J1rc12 (anti-sense, the HCV portion derived from nucleotides ending at 1995 of HCV1; the restriction enzyme site
is underlined)
5' CHQMIIC TAA CGG GCT GAG CTC GGA 3'

J1rc13 (anti-sense, the HCV portion derived from nucleotides ending at 1941 of HCV1; the restriction enzyme site
is underlined)
5' CTTAGAATTC CGT CCA GTT GCA GGC AGC TTC 3'

PCR II

[0155]

J1rc13 (see supra)

J1IZ-1 (sense, the HCV portion is derived from nucleotides starting at number 1641 of HCV1; the restriction
enzyme site is underlined)
5' CTTGAATTC CAA CTG GTT CGG CTG TAG A 3'

J1IZ-2 (sense, the HCV portion is derived from nucleotides starting at number 1596 of HCV1; the restriction
enzyme site is underlined)
5' TGA GAC GGA CGT GCT GCT CCT 3'

Primers for the C200-C100 region of the "NS3-NS4" region

PCR I

[0156]

J1C200-1 (sense, derived from nucleotides starting at number 3478 of HCV1)
5' TCC TAC TTG AAA GGC TC 3'

J1C200-3 (anti-sense, derived from nucleotides ending at number 4402 of HCV1)
5' GGA TCC AAG CTG AAA TCG AC 3'

J1rcS8 (anti-sense, the HCV portion derived from nucleotides ending at 5853 of HCV1; the restriction enzyme site
is underlined)
5' CTTAGAATTC GAG GCT GCT GAG ATA GGC AGT 3'

511/16A (see above)

PCR II

[0157]

J1C200-2 (sense, the HCV portion derived from nucleotides starting at number 3557 of HCV1; the restriction
enzyme site is underlined)
5' CTTGAATTC CCC GTG GAG TGG CTA AGG CGG TGG ACT 3'

J1C200-4 (anti-sense, the HCV portion derived from nucleotides ending at 4346 of HCV1; the restriction enzyme
site is underlined)
5' CTTGAATTC TCG AAG TCG CCG GTA TAG CCG GTC ATG 3'

511/35A (see above)

J1rc§1 (anti-sense, the HCV portion derived from nucleotides ending at 5826 of HCV1; the restriction enzyme site
is underlined)
5' CTTAGAATTC GGC AGC TGC ATC GCT CTC CGG CAC 3'
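
Several of the primers above carry a non-HCV leader containing a restriction site ahead of the HCV portion. The sketch below separates the two parts, assuming that the site in question is the EcoRI recognition sequence GAATTC (an assumption made for illustration; the function names are hypothetical). It also shows that the HCV portion of J1C200-4 begins with the reverse complement of the NS3 sense primer 464S used in Example I.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence (assumed to be the site in the leader)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def split_at_site(primer, site=ECORI_SITE):
    """Split a primer into (leader, site, remainder) at the first occurrence of `site`.

    Returns None if the site is absent. Spaces in the primer are ignored.
    """
    seq = primer.replace(" ", "").upper()
    index = seq.find(site)
    if index < 0:
        return None
    return seq[:index], site, seq[index + len(site):]


# Sequences copied from the primer lists in the text (5' to 3').
J1C200_4 = "CTTGAATTC TCG AAG TCG CCG GTA TAG CCG GTC ATG"
J1RC13 = "CTTAGAATTC CGT CCA GTT GCA GGC AGC TTC"
PRIMER_464S = "GGCTATACCGGCGACTTCGA"

leader, site, hcv_portion = split_at_site(J1C200_4)
print(leader, site, hcv_portion)
print(split_at_site(J1RC13))
print(hcv_portion.startswith(reverse_complement(PRIMER_464S)))
```

A primer without the site, such as a plain 16-mer, simply returns None, so the same helper can be run over a mixed list.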

The amplified HCV cDNAs were either sequenced directly without cloning, and/or were cloned. Sequencing
was accomplished using an asymmetric PCR technique, essentially as described in Shyamala and Ames, J.
Bacteriology 171:1602 (1989). In this technique, amplification of the cDNA is carried out with a limiting concentration of
one of the primers (usually in a ratio of about 1:50) in order to get preferential amplification of one strand. The
preferentially amplified strand is then sequenced by the dideoxy chain termination method.

The primers used for asymmetric sequencing by the PCR method were the following. For the NS1 region: J1IZ-1
and J1rc13 (sequenced with both); J1IZ-2 and J1rc13 (confirmed on both strands). For the NS3-NS4 region, which
includes the C200-C100 N-terminal region, C200-C100 C-terminal region, and the 5-1-1 region: J1C200-2 and
J1C200-7 (for the N-terminal region of C200-C100), and J1C200-4 and J1C200-6 (for the C200-C100 C-terminal
region); and 511/35A and hep4 (for the 5-1-1 region). The sequences for J1C200-2, J1C200-4, and 511/35A are
shown supra; the sequences of hep4, J1C200-6, and J1C200-7 are the following.

hep4 (derived from nucleotides starting at number 5415 of HCV1)
5' TT GGC TAG TGG TTA GTG GGC TGG TGA CAG 3'

J1C200-6 (the HCV portion derived from nucleotides starting at number 3875 of HCV1; the restriction enzyme site
is underlined)
5' CTTGAATTC CGT ACT CCA CCT ACG GCA AGT TCC TT 3'

J1C200-7 (the HCV portion derived from nucleotides starting at number 3946 of HCV1; the restriction enzyme site
is underlined)
5' CTTGAATTC GTG GCA TCC GTG GAG TGG CAC TCG TC 3'

[0158] The sequences obtained by asymmetric sequencing of the "NS1" region, the C200-C100 region, and the 5-1-1
region are shown in Figure 15 and Figure 16, respectively. In the figures, the amino acids encoded in the J1
sequence are shown above the J1 nucleotide sequence. The differences between the J1 sequence and the HCV1
prototype nucleotide sequence are shown below the J1 sequence (the dashes indicate homologous nucleotides in both
sequences). The encoded amino acids which differ in the HCV1 prototype sequence are shown below the HCV1
nucleotide sequence.

[0159] HCV cDNAs from the NS1 region, the C200-C100 region, and the 5-1-1 region were cloned. A 300 bp and a
230 bp fragment from the putative NS1 region were cloned into a derivative of the commercially available vector,
pGEM-3Z, in host HB101, and deposited with the ATCC as AW-300bp. The derivative vectors maintain the original
pGEM-3Z polylinkers, an intact AmpR gene, and the genes required for replication in E. coli. The HCV cDNA fragments
may be removed with SacI and XbaI. HCV cDNAs containing 770 bp N-terminal fragments of C200 were cloned into
pM1E in HB101. Twelve clones were pooled and deposited with the ATCC as AW-770bp-N; the HCV cDNA may be removed
from the vector with HaeII. The resultant HaeII fragment will contain vector DNA of 300 bp and 250 bp at the 5' and 3'
ends, respectively. HCV cDNAs containing 700 bp C-terminal fragments of C200 (AW-700bp-C) were cloned into
M13mp10 and maintained in host DH5α-F'; cloning was into the vector polylinker site. The resultant phage were pooled,
and deposited with the ATCC on September 11, 1990 as AW-700bp-N or AW-700bp-C. HCV cDNA from J1 equivalent
to the 5-1-1 region of HCV1 was cloned into the mp19 RI site, and maintained in DH5α-F'. Several M13 phage supernatants
from this cloning were pooled and deposited with the ATCC as J1 5-1-1, on September 11, 1990. The HCV cDNAs may
be obtained from the phage by treatment with EcoRI. Accession numbers for J1 5-1-1 and AW-700bp-N or AW-700bp-C
may be obtained by telephoning the ATCC at (301) 881-2600.

[0160] The above-described cloned material was deposited with the American Type Culture Collection (ATCC). 

[0161] An HCV cDNA library containing sequences of the putative "NS1" region of the J1 isolate was created by
directional cloning in λgt22. The "NS1" region extends from about nucleotide 1460 to about nucleotide 2730 using the
numbering system of the HCV1 prototype nucleic acid sequence, where nucleotide 1 is the first nucleotide of the initiating
methionine codon for the putative polyprotein. The cloning was accomplished using essentially the method described
by Han and Rutter in GENETIC ENGINEERING, Vol. 10 (J.K. Setlow, Ed., Plenum Publishing Co., 1988), except that
the primers for the synthesis of the first and second strand of HCV cDNA were JHC 67 and JHC 68, respectively, and the
source of RNA was the J1 plasma. In this method the RNA is extracted with guanidinium thiocyanate at a low
temperature. The RNA is then converted to full length cDNA, which is cloned in a defined orientation relative to the lacZ
promoter in λ-phage. Using this method, the HCV cDNAs to J1 RNA were inserted into the NotI site of λgt22. The
presence of "NS1" sequences in the library was detected using ALX 54 as probe.

[0162] The sequence of a region of "NS1" downstream from the region shown in Figure 14, but which overlaps the
region by about 20 nucleotides, was determined using the asymmetric sequencing technique described above, but
substituting ALX 61 and ALX 62 as primers for PCR amplification. The resulting sequence is shown in Figure 17. (It should
be noted that the PCR amplification was of a region from about nucleotide 1930 to about nucleotide 2340; this region
is also encompassed in the sequence shown in Figure 15.) The sequences of the primers and probes used to obtain
the HCV cDNA library in λgt22, and to sequence the portion of the "NS1" region, were the following.

JHC 67
5' GACGC GGCCG CCTCC GTGTC CAGCG CGT 3'

JHC 68
5' CGTGC GGCCG CAAGA CTGCT AGCCG AGGT 3'

ALX 61
5' ACCTG CCACT GTGTA GTGGT CAGCA GTAAC 3'

ALX 62
5' ACGGA CGTCT TCGTC CTTAA CAATA CCAGG 3'

ALX 54
5' GAACT TTGCG ATCTG GAAGA CAGGG ACAGG 3'

[0163] A 400 bp fragment of J1 HCV cDNA derived from the sequenced region was cloned into pGEM3Z and maintained
in HB101; the HCV cDNA may be removed from the vector with SacI and XbaI. Host cells transformed with the
vector (JH-400bp) have been deposited with the ATCC.

[0164] A pooled cDNA library was created from the J1 serum; the pooled library spans the J1 genome and is identified
as HCV-J1 λgt22. The pooled cDNA library was created by pooling aliquots of 11 individual cDNA libraries, which had
been prepared using the directional cloning technique described above, except that the libraries were created from
primers which were designed to yield HCV cDNAs which spanned the genome. The primers were derived from the
sequence of HCV1, and included JHC 67 and JHC 68. The HCV cDNAs were inserted into the NotI site of λgt22. The
pooled cDNA library, HCV-J1 λgt22, has been deposited with the ATCC.

VI

[0165] The sequence of a region of the polynucleotide upstream of that shown in Figure 14 was determined. This
region begins at nucleotide -267 with respect to the HCV1 (See Figure 12) and extends for 560 nucleotides. Sequencing
was accomplished by preparing HCV cDNA from RNA extracted from J1 serum, and amplifying the HCV cDNA
using the PCR method.

[0166] RNA was extracted from 100 of serum following treatment with proteinase K and sodium dodecyl sulfate
(SDS). The samples were extracted with phenol/chloroform, and the RNA precipitated with ethanol.
[0167] HCV cDNA from the J1 isolate was prepared by denaturing the precipitated RNA with 0.01 M MeHgOH; after
ten minutes at room temperature, 2-mercaptoethanol was added to sequester the mercury ions. Immediately, the mix
for the first strand of cDNA synthesis was added, and incubation was continued for 1 hr at 37°C. The conditions for the
synthesis of the anti-sense strand were the following: 50 mM Tris HCl, pH 8.3, 75 mM KCl, 3 mM MgCl2, 10 mM
dithiothreitol, 500 µM each deoxynucleotide triphosphate, 250 pmol specific antisense cDNA primer r25, 250 units MMLV
reverse transcriptase. In order to synthesize the second strand (sense), the synthesis reaction components were
added, and incubated for one hour at 14°C. The components for the second strand reaction were as follows: 14 mM Tris
HCl, pH 8.3, 68 mM KCl, 7.5 mM ammonium sulfate, 3.5 mM MgCl2, 2.8 mM dithiothreitol, 25 units DNA polymerase I,
and one unit RNase H. The reactions were terminated by heating the samples to 95°C for 10 minutes, followed by
cooling on ice.

[0168] The HCV cDNA was amplified by two rounds of PCR. The first round was accomplished using 20 µl of the
cDNA mix. The conditions for the PCR reaction were as follows: 10 mM Tris HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2,
0.002% gelatin, 200 mM each of the deoxynucleotide triphosphates, and 2.5 units Amplitaq. The PCR thermal cycle
was as follows: 94°C one minute, 50°C one minute, 72°C one minute, repeated 40 times followed by seven minutes at
72°C. The second round of PCR was accomplished using nested primers (i.e., primers which bound to an internal region
of the first round of PCR amplified product) to increase the specificity of the PCR products. One percent of the first PCR
reaction was amplified essentially as the first round, except that the primers were substituted, and the second step in
the PCR reaction was at 60°C instead of 50°C. The primers used for the first round of PCR were ALX90 and r14. The
primers used for the second round of PCR were r14 and p14.
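
The two thermal programs described above can be written down as data, which makes the difference between the rounds (annealing at 50°C versus 60°C) explicit and gives the total programmed hold time. This is an illustrative sketch; the class and function names are hypothetical, and ramp times between temperatures are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temperature_c: int
    minutes: float


def program_minutes(cycle, repeats, final_extension):
    """Total programmed hold time: repeated cycle plus one final extension."""
    return repeats * sum(step.minutes for step in cycle) + final_extension.minutes


# First round: 94 C, 50 C, 72 C for one minute each, 40 times, then 7 minutes at 72 C.
FIRST_ROUND = (Step(94, 1), Step(50, 1), Step(72, 1))
# Second (nested) round: as the first, but with the second step at 60 C.
SECOND_ROUND = (Step(94, 1), Step(60, 1), Step(72, 1))
FINAL_EXTENSION = Step(72, 7)

first_round_minutes = program_minutes(FIRST_ROUND, 40, FINAL_EXTENSION)
second_round_minutes = program_minutes(SECOND_ROUND, 40, FINAL_EXTENSION)
print(first_round_minutes, second_round_minutes)
```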

[0169] The sequences of the primers for the synthesis of HCV cDNA and for the PCR method were the following.

r25
5' ACC TTA CCC AAA TTG CGC GAC CTA 3'

ALX90
5' CCA TGA ATC ACT CCC CTG TGA GGA ACT A 3'

r14
5' GGG CCC CCAG CTA GGC CGA GA 3'

p14
5' AAC TAG TGT CTT CAC GCA GAA AGC 3'

[0170] The PCR products were gel purified, the material which migrated as having about 615 bp was isolated, and
sequenced by a modification of the Sanger dideoxy chain termination method, using ³²P-ATP as label. In the modified
method, the sequence replication was primed using P32 and R31 as primers; the double stranded DNA was melted for
3 minutes at 95°C prior to replication, and the synthesis of labeled dideoxy terminated polynucleotides was catalyzed
by Bst polymerase (obtained from BioRad Corp.), according to the manufacturer's directions. The sequencing was
performed using 500 ng to 1 ng of PCR product per sequencing reaction.

[0171] The primers P32 (sense) and R31 (antisense) were derived from nucleotides -137 to -115 and from
nucleotides 192 to 173, respectively, of the HCV1 sequence. The sequences of the primers are the following.

P32 primer
5' AAC CCG CTC AAT GCC TGG AGA TT 3'

R31 primer
5' GGC CGX CGA GCC TTG GGG AT 3'
where X = A or G
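
The coordinate ranges and the degenerate position given above can be checked mechanically. The sketch below computes the inclusive length of each primer from its stated coordinates and expands the R31 primer into the two sequences implied by X = A or G; the function names are hypothetical, and the coordinate arithmetic assumes both ends of a primer lie on the same side of nucleotide 1.

```python
def span_length(start, end):
    """Inclusive number of positions between two coordinates on the same side of nucleotide 1."""
    return abs(end - start) + 1


def expand_degenerate(primer, symbol="X", choices=("A", "G")):
    """Return one concrete sequence per allowed base at the degenerate position."""
    seq = primer.replace(" ", "").upper()
    return [seq.replace(symbol, base) for base in choices]


R31 = "GGC CGX CGA GCC TTG GGG AT"

p32_length = span_length(-137, -115)
r31_length = span_length(192, 173)
r31_variants = expand_degenerate(R31)

print(p32_length, r31_length)
print(r31_variants)
```

The computed length of 20 for R31 matches the length of the sequence as printed, and the two expanded variants differ only at the sixth position.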

[0172] The sequence of the region in the J1 isolate which encompasses the 5'-untranslated region as well as a part
of the region of the putative "Core" is shown in Figure 18. In the figure, amino acids encoded in the J1 sequence are
shown above the nucleotide sequence. The sequence of the prototype HCV1 is shown below the J1 sequence; the
dashes indicate sequence homology with J1. The differing amino acids encoded in the HCV1 sequence are shown
below the HCV1 sequence.

[0173] An HCV cDNA fragment which is a representative of the 600 bp J1 sequence described above (TC 600bp)
was cloned into pGEM3Z and maintained in host HB101; the HCV cDNA fragment may be removed with SacI and XbaI.
This material is on deposit with the ATCC.

Patent Microorganism Depository - deposited under Budapest Treaty terms.

[0174]

Deposited Materials                                        Accession Number    Deposit Date

E. coli DH5/pS1-8791a                                      BP-2593             9/15/1989
(This clone contains 427 bp of the NS5 domain of J1)

E. coli HB101/pU1-1216c                                    BP-2594             9/15/1989
(This clone contains 351 bp of the E/NS1 domains of J1)

E. coli HB101/pU1-4652d                                    BP-2595             9/15/1989
(This clone contains 583 bp of the NS3 domain of J1)

E. coli DH5α/pS1-713c                                      BP-2637             11/1/1989
(This clone contains 580 bp of the E domain of J1)

E. coli DH5α/pS7-28c                                       BP-2638             11/1/1989
(This clone contains 552 bp of the C/E domain of J7)

E. coli DH5α/pS1-1519                                      BP-3081             8/30/90



[0175] The following vectors described in the Examples were deposited with the American Type Culture Collection
(ATCC), 12301 Parklawn Dr., Rockville, Maryland 20852, and have been assigned the following Accession Numbers.
The deposits were made under the terms of the Budapest Treaty.



Deposited Materials                                        Accession Number    Deposit Date

TC-600bp (in E. coli HB101/pGEM3Z)                         68393               9/11/90

JH-400bp (in E. coli HB101/pGEM3Z)                         68394               9/11/90

AW-300bp (in E. coli HB101/pGEM3Z)                         68392               9/11/90

AW-770bp-N (in E. coli HB101/pM1E)                         68395               9/11/90

AW-700bp-C or AW-700bp-N (in E. coli DH5α-F'/M13mp10)

J1 5-1-1 (in E. coli DH5α-F'/M13mp10)

HCV-J1 λgt22                                               40884               9/6/90



These deposits are provided for the convenience of those skilled in the art. These deposits are neither an admission
that such deposits are required to practice the present invention nor that equivalent embodiments are not within the skill
of the art in view of the present disclosure. The public availability of these deposits is not a grant of a license to make,
use or sell the deposited materials under this or any other patent. The nucleic acid sequences of the deposited materials
are incorporated into the present disclosure by reference, and are controlling if in conflict with any sequences described
herein.

[0176] While the present invention has been described by way of specific examples for the benefit of those in the field,
the scope of the invention is not limited, as additional embodiments will be apparent to those of skill in the art from the
present disclosure.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: CHIRON CORPORATION 
(ii) TITLE OF INVENTION: New HCV Isolates 
(iii) NUMBER OF SEQUENCES: 59 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: J. A. Kemp & Co. 

(B) STREET: 14 South Square Gray's Inn 

(C) CITY: London 

(D) STATE: 

(E) COUNTRY: United Kingdom 

(F) ZIP: WC1R 5LX 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: EP 90310149.1 

(B) FILING DATE: 17-SEP-1990 

(C) CLASSIFICATION: C12N 15/51 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Goldin, Douglas M. 

(C) REFERENCE/ DOCKET NUMBER: N* 61241-DMG/kst 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 071-405-3292 

(B) TELEFAX: 071-242-8932 

(C) TELEX: 23676 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CGTGCCCCCG CAAGACTGCT



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
CCGTCCTCCA GAACCCGGAC



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GCCGACCTCA TGGGGTACAT



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
AACTGCGACA CCACTAAGGC



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TGGCATGGGA TATGATGATG



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TTGAACTTGT GGTGATAGAA



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGCTATACCG GCGACTTCGA



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GACATGCATG TCATGATGTA



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDECNESS: single 
(0) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
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GCTGGAAAGA GGCTCTACTA 



(2) INEORMATICN FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) I^NGTCH: 20 base pairs 

(B) TYEE: nucleic acid 

(C) STRANEECNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GITCITACIG COCAGITGAA 



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
ACIGCCCTGA ACTGCAATGA 



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
AAOCAGITGA GITCATCCA



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
AACAGGCPGC CTGCTC



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
AGTTOCTCTC GACAGC 



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CITCAATTCT O J ll T imiT GGGAAGCOQG CAATC 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CTTGAATTOC CICIGCCTCA OGGGAOGCGG TCTGC 



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CTACTGCCTG GGGGAAACAT



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
dTAGAAITC TQGCATGGGA TATCATGATC 



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
CTTAGAATTC TCCATGCTGG GGAACTQGGC 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
CTTAGAATTC OCTOCACTIG CAGGCAGCTT C 



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 
CITGAA3TCC AACTGGTTOG GCTCTAC



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
TCAGAOGGAC GTGCVGCTCC T 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
TCCTACITCA AAGGCIC 



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GGATOCAAGC TCAAATOGAC 



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CTTAGAATTC GAGGCIGCTG AGATAGGCAG T



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
CTTGAMTOC OOCTGGAGTC GCTAAGGCGG TGGACT 



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
CITCAATTCT OGAAGTOGOC GG7EATAGOOG GTCATG 
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CTEAGAAJTC GGCAGCTGCA TOGCXCTOOG GCAC 



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
TTGGdMTG GTTAGTQGGC TGGTCACAG



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
dTGAATTOC GTACTQCACC TAOGGCAAGT TCCTT 



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
CTTSAA1TCG TGGCATCCCT GGAGIGGCAC TOGTC 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
GAOGOGGCOG CXSCOGOTIC CAGOGOGT 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
CGTCCGGOOG CAAGACTGCT AGCOGAGGT 



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

ACCIGOCACT GICTAGIQGT CAGCAGIAAC 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



AOQGACGTCT TOCTOCITAA CAATACCAGG 



(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
GAACITTGOG ATCIGGAAGA CAGGGACAGG 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
ACCITACOCA AA2TCOGCGA CCTA 



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
OCATGAATCA CICOCCICTG AGGAACIA 
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GGGOOOOCAG CTAGGCOGAG A 



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
AACTACTGOC TTCACGCAGA AAGC 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 
AAOOOGCTCA ATCCCTCGAG ATT 



(2) INFORMATION FOR SEQ ID NO:42: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
GGCCGRCGAG CCTIGGGGAT 20 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 552 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

OGAAAGGCCT TGTCCTACIG CCIGAEAGGG TGCITGOGAG 60 

TGOOOOQQGA GGTCTCGTAG ACCGTCCATC ATGAGCACAA ATCXTAAACC YCAAAGAAAA 120 

AOCAAAOSEA ACAGCAAGOG TOGCOCACAG GAOGTYAACT TCCCKDGOQG TGGTCAGATC 180 

GTYGGTOGAG TmCTTCTr GCCROGCAGG GGOCCCAGGT TCQGTCTGOG TGOGACTAGG 240 

AAGACTIOOG AGCOGTCRCA ACCTOCTSGA AGGOGACAAC CTATCCCCAA GGCTOGCOGG 300 

OOOGAGQGCA GGACCTOGGC TCAGCCTQGG TATCCTTGGC CCCTCTATGG CAATGAGGGC 360 

TWGQGGIGGG CAGGATGGCT CCICTCACCC OGOGGCICTC GGOCTAGTIG GGGOOCYAMT 420 

GAGOOCOOGC CTAGGTOGOG TAATTIGGGT AAGGTCATOG ATAGCCTTAC ATGCQGCVTC 480 

GOOGAOCTCA TGGGGEACAT YOOGCIYGTC GGOGCCCCCT TAGGGGGOGC TGCCAGGGCC 540 

CTQGCACAIG GT 552

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 580 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
TCOGCTOGTC GGOGCCCCCT YAGGGGGCGC TGCCAGGGCC CTGGCACAIG GTCICOQGCT 60

TCTGGAGGAC GGOGTGAACT ATGCAACAGG GAATTTGCCC GCTIGCTCIT TCTCTATCIT 120

OCTCTIRGCT YTCCICTCCT GTITSACCAT CCCAGCTTCC GCITATGAAG TCOGCAAOGT 180 

CTOOGQGKEA TAYCATG7ICA CAAAOGACIG CICCAACTCA AGCATTCTGT ATCAOGOGGC 240 

GGACGTCATC A1GCATCCCC COGGGIGCCT GCCdGCGTT OGGGAGAACA AYTCCTCCCG 300 

T I G CTQGG TA GCQCTCACTC CCAOGCTOGC GGCCAGGAAT GCCAGCCTOC CCACEAGGAC 360

AIIRGGAOGC CAOCTOGACT TG ClUal TGG GAGGGCIGCT TTCTGCTOOG CIATCTAOCT 420 

GGGQGATCIC TG03GATCIG TTITCdYAT CTCCCAGCIG TTCACC1TCT OGCCTOGCOG 480 

GCATGAGACA GTACAGGACT GCAACIGCTC AATCTATCCC GGCCAOGIAT CAGGCCATOG 540 

YATOGCTIGG GMATCATGA TGAACTCGIC GCCCAOGGCA 580

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 351 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

AACIGCTOGC CCACQGCAGC CTTACTGGTC TOGCAGTTAC TCCGGATCCC ACAAGCTGTC 60 

A3GGACATGG TCGOGGGGGC OCACTGGGGA GTCCTRGOGG GCCITGCCEA CEATTCCATC 120 

GIRQGGAACr GQGCEAMGT TITGATTCIG ATCCEACICT TTGCOGGCGT TGAOGGGMRT 180

ACCOGOCTGA OGGGRGGGCT GCAAGGCCAY GICAOCTCIR CACICAOCTC CCICTITAGA 240 

OCTGGG G OGT COCAGAAAAT TCAGYYIKEA AACACCAATC GCAGTT3GCA TATCAACAGG 300 

ACTGCCCTGA AOTGCAATGA CTOCCTCCAA ACIGGCTICC TIGCOGOGCT G 351 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 583 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CICAGIGMC GACIGTCAACA CATCTCTCAC TCAGACGCTC GATTTCAGCr TGGATCCCRC 60

CITCAOCAIC GAGAOGAOGA CCCTGOCOCA AGATGCGGIT TOGOGCAOGC AGOQGOGAGG 120 

TAGGACIGGC AGGGGCAGGA GAGGCATCTA TAGGTITCIG ACTCCAGGAG AAOGGCCCTC 180 

GGOGATOITC GATTCTTOGG TCCTATCIGA CTGTTATGAC GCGGQUCTS CTIGGIATGA 240 

GCICAGGOOC GCTGAGACCT CGCTTAGGTT GOGGGCTIAC CTAAA3ACAC CAGGCTTGCC 300

OGTCIGOCAG GACCATCIQG AGTICIGGGA GAGOGTCTTC ACAGGCCTCA OCCACAXAGA 360 

OGGOCACriC TTCTCOCAGA CTAAGCAGGC AGGAGACAAC TTOCCCTACC TGCTAGCATA 420 

(XAAGCCACA GIGTCOGCCA GGGCTAAGGC YCCACCTCCA TOGTGGGATC AAATCIGGAA 480 

GTOICTCATA OGGCTAAAGC CIAOGCIGCA OGGGCSAAOG OOOCIGCICT ATAO^CTAGG 540 

AGOOGTO CA G AATCAGCTCA COCTCACACA COCTATAACC AAA 583 

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 427 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

CCTCACCOCT GACCCCAOOG TOCOCXTIGC GOGGGCTGOG TGGGAGAGAG CTAGACACAC 60 

YCCAGTCAAC TOCTOGCTAG GCAACATCAT YATCTATCCG CCCACITTCT GGGCAAGGAT 120 

GATTCTGATC ACTCACTICr TCTOCATCCT TCTAGCCCAG GAGCAACTK AAAAAGOOCT 180 

GGAXICTCAA ATCTAOGGGG OCICTEAdC CAITCAGCCA CTIGACCEAC CTCAGATCAT 240 

TGAAOGACTC CATGGICITA GOGCATTTTC ACTCCATAGT TACICICCAG GIGAGATCAA 300 

TAGGGIGGCT TCATCCCTCA GGAAGCITOG GGTACCACCC TTGOGACTCT GGAGAGATOG 360 

GGOCAGAAGT CT00 GO GCIA AGCEACIGIC OCAPGGGGGG AGGGCOGCCA CTTCTCGCAA 420 

CTACCIC 427

(2) INFORMATION FOR SEQ ID NO:48: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 552 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

AGCOC»CTAG m i ' l UU m Ug OGAAAGGOCT TCTGCTACIG CCIGAIAGGG TGCITGCGAG 60 

TGCOCOGGGA GCTCTOCTAG AGOCTGCAYC ATGAGCACRA ATCCTAAACC YCAAARAAAA 120 

AMCAAAOGTA ACAOCAACOG TOGOOCACAG GACGIYAACT TCOOGGGYGG YGGTCAGATC 180 

GTYGGTGGftG TITACITCIT GCOGOGCAGG GGCCCYAGRT TGGCTCTGOG YGOGACKAGR 240 

AAGACITCCG AGCGCTOGCA ACCTCGWGGW AGKCGWCARC CTATCCCCAA GGCTCGYOGG 300

OOOGAGGGCA GGAOCTQGGC TCAGCCYGGG TAYCCITGGC CCCTCTATGG CAATOAGGGC 360 

TKSGGOTGGG CRGGATGGCT CCTCTCWCCC OGYGGCTCIC GGCCIAGYIG GGGCOCCAMW 420 

GACCCCCGGC GTAGGrTOGOG YAATTTGGCT AAGCTCATCG ATACOCITAC RTCCGGCITC 480 

GCOGAOCTCA TGGGGEACAT WCOGCTYGTC GGOGCCCCYY TWGSaSGCGC TGOCAGGGOC 540 

CIGGCKCATO GY 552

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 580 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
WO0GCT0CTC GGOGCCCCYY TOGGRGGOGC TGCCAGGGCC CTGGCRCATC GYGTCOGGCT 60 
TCTGGARGAC GGOGTOAACT ATGCAACAGG GAAYYTKCCY GGTTCCTCIT TCICTATCIT 120
OCTYYIGGCY CIGCTSTCYT GYTTGACYRT SCCMGCITCS GCYTAYSAAG TGCGCAAOCY 180 
SWCSGGGMIW TAOCAYGTCA CMAAYGAYTC CYCYAACTCR AGCATTCICT AYGAGGCGGC 240 
SGAYGYSAIC MTGCAYRCYC CSGGGTCOGT SCCYIGCGTCT CGKEAGRRCA AYKCCTCSMG 300 
KDGYTOGGIR GOGMTSACYC CYAOGSTSGC SRCCAGGRAT GSCARMSTCC CCKCKACQM 360 
RYTWOGAOGY CACRTOGAYY TCCTYGTYGG GASSGCVRCY YTCIGYTCSG CYMTSTAOGT 420 
GGGGGAYCIM TGOGGKTCIG TYITYCIYRT CKSCCARCTG TTCACCTTCT CKOCYMGSOG 480
SCAYKRGACR KYRCARGRYT GCAAYTGCTC WATCTATCCC GGCCAYREAW CRGGYCAYCG 540
CATCGCWIGG GATATCATCA TGAACIGCTC SCCYACGRCR 580 



(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 351 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

[...] TCOGGATCOJ ACAAGCYRTC 60

WIGGACATGR TSG [...] GCMIVJGCSTA YWYTTCCAIG 120

[...] TTCOOGGOCT YGAOGSGSAW 180

[...] SAYTYRYKWS CCTCYTYRSA 240

[...] GCACTIGGCA YMTCAAYAGS 300

ACHGOOCTSA ACIGCAATGA YWSCCTCMAN ACYGGSTK5Y TNSCM3SGCT K 351



(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 583 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

CTCRGTCA3M GACTCYAAYA OTCICTCAC YCAGACRGTC GATtTCAGCY TKGAYCCYAC 60 

CTTCAOCATY GAGACRAYSA CSSTSCCCCA RGATGOCCTY TCSOGCACKC ARCGNOGRGG 120 

YAGGACTGGC AQGGGSARQ1 SAGGCATCTA YAGRITTGTC RCWCCRGGRG ARCGSCCCTC 180 

SGSSATCTTC GAYTCKTCSG TQdMICTGA CTGYTATCAC GCRGGCTCIG CITQCTATSA 240 

GCTCAOGCCC GCYGAGACYW CRGTIAGGYT RCGRGCKTAC MTRAAYACMC CRGGGYTKCC 300 



OCTSTGCCAG GACCATCTKG ARTIYTGGGA GRQCXJIX.TIY ACAGGCCTCA CYCAYAIAGA 360 

YGCXCACTTY YTOTOCCAGA CWAAGCAGRS WGGRGASAAC YTYCCYTAOC TGGTAGOOA 420 

CCAAGOCACM CTCTGOGCYA GGGCIMARGC YCCWCCYCCA TCCTGGGAYC ARATCTGGAA 480 

GKTKTSAIW CGSdMAAGC CYACSCTSCA YGGGCCAACR CCOCTQCIOT AYAGRCIRGG 540 

JCCYCTYCAG AATGARRICA CCCISACPCA CCCWRIWACC AAA 583 



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 427 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

OCTCAOOOGT GACCCYACMR YCCCOCTYGC GMGRGCIGOG TCGGAGACAG CWAGAGACAC 60 

TCCACTCAAY TCCIGGCIAG GCAACMMAT CATOIWIGCS CCCACWYIGT GGGCKAGGAT 120 

GATWCTGATC ACYCAYTTCT T V TCCRTCCT TMTAGCCWRG GASCARCTIG AAMARGOCCT 180 

SGATIGYSAR ATCEAOGGGG CCTGYTACTC CATWGARCCA CITCAYCEAC CTCMRATCAT 240 

TSAAMSACTC CATGGYCTYA GOGCATnTC ACTOCAYACT TACICTOCAG CTGARATYAA 300 

TAGGCTGGCY KCATCCCTCA GRAARCITGG GGI7VCCRCCC TTGCGAGYYT GGAGACAYCG 360

GGCCMSttGY GTCCGCGCTA RGCIWCIGKC CMRAGGRGGS AGGGOfGCCA YWICTGGCAA 420 

GTAOCTC 427 



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8865 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

ATCAGCAOGA ATCCIAAACC TCAAAAAAAA AACAAAOGTA ACACCAACOG TCGCCCAGAG 60 

GAOGTTCAACT TCCOGGCTGG CGCTCAGATC GTTGGTCGAG TITACT1U1T GCOGCGCAGG 120 

GGCCCTAGAT TCOffTOTOOG OGCGAGGAGA AAGACITCCG AGCGGTTOGCA ACCTOGAGCT 180 

AGAOCTCAGC CHMOCCCAA GGCTOCTOGG CCOGAGGGCA GGAGCIGGGC TCAGCCCGGG 240 

TAOCCITCGC WllTATC^ CAATCAGGGC TGOGGGTOGG CGGGATCGCT OCICTCTCCC 300 

OGTIGGCICIC GGCCTAGCIG GGGCCCCACA GACOCCOGGC GTAGGTOGOG CAATTIGGCT 360 

AAGGTCATOG ATACCCITAC GIGOGGCITC GCCGACCTCA TGGGCTACAT ACOGCTOGTC 420 

GGOGOCCCIC TTGGAGGCGC TGCCAGGGCC CTCGOGCATG GOCTCOGGCT TCTGGAAGAC 480 

GGOGTGAACT ATCCAACAGG GAACCTICCT GGITCCrCIT TCICEATCIT OdTCIGGOC 540 

ci u ai'icrr gctigacict gcccgcttog goctaccaag tqogcaactc caoggggctt 600

TACCAOGTCA CCAATCATTC C3CCTAACTOG ACTATTCICT AOGAQGOQGC OGATGCCATC 660 

CTCCAGACTC OGGGGTTGOGT CQCTTGOGTT CCTGAGGGCA AOGCCTCGAG G7ICTTGGCTG 720 
GCGATCACCC CEAOGCTGGC CACCAGGGAT GGCAAACTCC COGOGAOGCA GCTIOGAOCT 780 
CACATOGATC TGCTICTOGG GftGOGOCACC CTCTCITCGG CCCTCIAOGT GGGGGACCIA 840 
TCOGGGTCIG ItaTllTlUT OGGCCAACTC TTCACCITCT CTCCCAGGOG CCACTGGAOG 900
AOGCAAGGTT GCAATTSCTC TATCIATCCC GGCCATATAA CGGCTCAOOG CATGGCATGG 960 

GAIATCATCA TGAACIGGTC CCCTAOGACG GOGTTCGTAA TGGCPCAGCT GCTC0GGA1C 1020 

CCACAAGCCA TCITCGACAT GATCGCTGCT GCTCACTGGG GAGTCCTGGC GGGCAXAGOG 1080 

TATTICICCA TGCTGQQGAA CTGGGOGAAG GICCTOGTAG TGCTGCTGCT ATTIGCOGGC 1140 

CTCGACGOGG AAACCCACCT CACCGGGGGA AGTGCCGGCC ACACTCTCTC TSGkiTiwr 1200

AGOCTOCTOG CAOCAGGOGC CAAGCAGAAC GICCAGCTCA TCAACACCAA CGGCAGrlTGG 1260 

OVOCTCAAIA GCAOGGOCCT GAACTGCAAT GATAGCCTCA ACACOGGCTG GTIGGCAGGG 1320

T1T1UJM CA OCACAAGTTC AACICTTCAG GCICTCCIGA GAGGCTAGOC AGCTGCCGAC 1380 

CCCTEACOGA TTTTGAOCAG GGCTGGGGCC CTATCAGTTA TGCCAAOGGA AGOGGCCCOG 1440 

ACCAGOGCCC CTACIGCT G G CACTACCCCC CAAAACCTIG OGGTATICTG CCOGOGAAGA 1500

GTCICTGT G G TCOGGTATAT TGdTCACIC CCAGCCCOGT GCTGGTCGGA ACGACCGACA 1560

GCTCGGGCGC GCXTACCTAC AGCIQGGGTIG AAAA1GATAC GGAGCTCITC GTCCTTAACA 1620

A1ACCAGGCC ACOGCIGGGC AAHTGGTTCG GITCTACCIG GATCAACTCA ACTGGATTCA 1680 

CCAAACTCTS CGGAGGQCCT OCTTCTCTCA TOQGAQGGGC GGGCAACAAC ACCCTGCACT 1740 

GCCCCACTGA TTOCITOOGC AAGCATCOGG ACXSCCACATA CICTOQGrlGC GGCTCOGGTC 1800 

CCTQGATCAC AGOCAGGTOC CTGCTOGACT ACOOCTATAG GCTITOGCAT TATOCTIGTA 1860

CCATCAACEA CAOCATAXIT AAAATCAGGA TGTTAOGTCGG AGGGGTOGAA CACAGGCTGG 1920 

AAGCTGOCT S CAACTGGAOG CGGGGOGAAC GITGOGATCT GGAAGACAGG GACAGGOCOG 1980 

A3CTCAGC0C GTEACTGCTG ACCACTACAC AGIGGCAGCT CCICCOCTCT TOCTTCACAA 2040 

COCTACCB3C CITCTOCAOC GGCCTCATCC ACCTCCACCA GAACATTCTG GAOG7IGCACT 2100 

AdTGTAOGG GGTOQQGTCA AGCATOGOCT CCTGGGCCAT TAAGPQGGAG TAOCTCXTriC 2160

TUL I UITUL T TCTGCITGCA GACGOGCGCG TCTGCICCTG CTICTGGATC AIGCEACTCA 2220 

TATOCCAAGC GGAGGOGGCT TXGGAGAAOC TOGrTAATACT TAATCCAGCA TOOCTGGCOG 2280 

GGAOGCAOGG TCTICTATCC TICCTOGTCT TCTIKJP GC IT TGCATCCTAT TIGAAGGGTA 2340 

ACTGGGTGCC OGGAGOGGTC TACACdTCT AOGGGATOIG GCCTCTCCIC CIGCTCCICT 2400 

TQGOGOTGCC CCAGOGGGCG TAOGOGCTGG ACAOGGAGGT GGOOGCCTCG TCTGGCGCTG 2460

TICTTCTOCT OGGGTIGATG GOGCTGACTC TGTCACCATA TTACAAGOGC TAHATCAGCT 2520 

GGTCCITCIG GIGGCXTCAG TATITTCTCA CCAGACTGGA AGOGCAACTG CAOGTCTGGA 2580 

TTOOOOOCCT CAAOCTOCGA GQGGGGOGOG AOGCCCTCAT CTTACTCATG TGTCCICTAC 2640 

ACCOGACTCT GGTEA3TIGAC ATCACCAAAT TGCTGCTGGC OGTCITOGGA LUJLTl'lGGA 2700 

TTCTTCAAGC CAGTITGCrr AAAGEACCCT ACTTTGTCOG OGTCCAAGGC CTICTO0GCT 2760

TCTOOGOSTT AGOGOGGAAG ATGATOGGAG GOCATTAOGT GGAAATCGTC ATCAITAACT 2820 

TAGGGGOGCT TACTGGGACC TATCTTTATA ACCATCTCAC TCCICITOGG GACIGGGOGC 2880 

ACAAOGGCIT GOGAGATCIG GCOGIGGCTG TAGAGCCAGT CGTCTTCTCC CAAATGGAGA 2940 

CCAAGCTCAX CAOCTGGGGG GCAGATACOG COGCCTGOGG TGACATCATC AAOGGCITGC 3000 

CICTTTQOGC OOGGAGGGGC CGGGAGATAC TGCTOGGGCC AGCCGATCGA ATCGTCTCCA 3060

AGGGCTGGAG GITGCTGGOG CCCAICAOGG CGTAOGCCCA GCAGACAAGG GGCCICCTAG 3120

GGIGCATAAT CAQCAGCCIA ACTGGCOGGG ACAAAAACCA ACTGGAGGCT GAGGTCCAGA 3180

TICTCTCAAC TGCIGCCCAA ACCITCCTGG CAAOCTGCAT CAAIGQGCTG TGCTGGACTG 3240 

TCXAOCAOGG GGCOGS\ACG AQGAOCATOG OGfTCACCCAA GGGTCCTOIC ATCCAGA3CT 3300 

AEAOCAATOT AGAOCAAGAC CTICTOGGCT GGCCOGCTCC GCAAGGTEAGC OGCTCAITGA 3360 

CACOCTGCAC TTGCGGCTCC TOQGAOCTTT AOCTGGTTCAC GAGGCAOGCC GATOICATTC 3420 

O0GTOOC3O0G GOQQQGT G AT AGCAGGGGCA GOCTGCTGTC GCOCCGGOCC A3TIOCEACT 3480 

TGAAAGGCIC CTOQQQQQCT OOGCTGTTGT GOOCOGOGGG GCAOGCCCTG GGCAIAJTEA 3540 

GGGOOGCGCT GTOCAOCOCT GGAGTQGCTA AGGOGGTOGA CTTEATCCCT CTGGAGAAOC 3600 

TAGAGACAAC CATCAGCTCC CXEGTCITCA CGGATAACTC CTCTCCACCA CTAG7IGCCCC 3660 

AGAGdTCCA GGrTGGCICAC CTOCATCCIC CCACAGGCAG OGGCAAAAGC AOCAAGCTOC 3720 

CX3GCTGCAIA TGCAGCTCAG GGCTATAAGG TGCEAGIACT CAAOCOCTCT GITGCTGCAA 3780 

CACTGGGCIT TQCTGCITAC ATCICCAAGG CICATCGGAT OGATCCTAAC ATCAGGACOG 3840 

GGGIGAGAAC AATTAfXACT GGCAGCOOCA TCACGIACIC CACCEAGGGC AAGTITCCITG 3900 

OOGAOGGOQG GIGCTOGQGG GGCGCITATC ACATAATAAT TTCIGAOGAG TOCCACTOCA 3960 

OGGAIGOCAC ATOCATCTIG GGCATOGGGA CICTCdTGA CCAAGCAGAG ACTGOGQGGG 4020 

OGMACTQGT TGTOCTOGCC ACCGCCAGCC CTOOGGGCTC CGICACTCTG CCOCATCCCA 4080 

ACATOGAGGA ( XS V VUC LXJS U TCCACCACOG GAGAGATCCC TTTTTAOGGC AAGGCTATOC 4140 

COCTOGAAGT AATCAAGGGG GGGAGACATC TCATCITCTG TCATTCAAAG AAGAACTGOG 4200 

AOGAACTOGC OGCAAAGCIG GTOGCA3TCG GCATCAATCC OCTGGCCEAC TACOGOGCTC 4260 

TIGAGGTCTC OCTCATCCOG ACCAGOGGCG ATCnTGTOCT OCTGGCAAOC GATGOCCICA 4320 

TGACOGGCTA TAOOGGOGAC TTOGACTOGG TCATAGACIG CAATACGTCT GICACOCAGA 4380 

CAGTOGATIT CAGOdTGAC CCTACCITCA OCATIGAGAC AATCAOGCTC CCCCAGGATC 4440 

CTOICrCOCXS CACTCAAOCT OGGGGCAGGA CTGGCAGGGG GAAGCCAGGC ATCTACAGAT 4500 

TTCTQGCACC GGGGGAGOGC CCCICOGGCA TCITOGACTC GTCCGTCCTC TGTCACTGCT 4560 

ATCAOGCAGG CJXJIG C TIUG TATCAGCTCA OGOC0GC0GA GACTACAGIT AGGCIAOGAG 4620 

OGTACATCAA CACCOOGGGG CTTCCOCTGrT GCCAGGACCA TCTTGAA3TT TGGGAGGGOG 4680 
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TCTnaCAGG CCTCACTCAT ATAGATCCCC ACITrcrATC CCAGACAAAG OYSACTGGGG 4740 

AGAAOCTT0C TEAOdGGIA GOCTACCAAG OCACOGTCTC OGCTAGGGCT CAAGC00CTC 4800 

OCOCKKCTG GGACCAGATC TGGAACTGrIT TGATTOGCCT CAAGCCCACC CT0CA3GGGC 4860 

CAACACOOCT GCEAIACAGA CTGGGOGdG TTCAGAATGA AATCACCCIG AOGCAGOCAG 4920 

TCAOCAAA3A CATCATCACA TGCATCPOGG CCGACCTGGA GGKCTCAOG AGCAOCTGGG 4980

TGCTOGITGG OQGCCTCCIG GCIGOTIGG COGOGTIATTG CCICTCAACA GGCTGOG TC G 5040 

TCAIACT3GG CAGGGTOCTC TTGTCCGGGA. AGCOGGCAAT CAIACCIGAC AGGGAACTCC 5100 

TCTAOOGAGA CTTOGATCAG AIGGAAGAGT GCICTCAGCA CTEAOOGEAC ATOGAGCAAG 5160 

GGATGAIGCT CGCCGAGCAG TTCAAGCAGA AGGOOCPOGG CCTOCIGCAG ACOGaTTCCC 5220 

CTCAGGCAGA GCTTATOGCC OC1GCICT0C AGACCAACIG GCAAAAACIC GAGACCITCT 5280

GGGOGAAGCA TATCTCGAAC TTCATCAGIG GGATACAATA CTTCGOSGGC TICTCAAOGC 5340 

TGOCTQGTAA COOCGCCATT GCTTCATIGA TCG LTITIA C AGCT GCT GTC ACCAGCOCAC 5400 

TAACCACTAG CCAAACCCTC CTCITCAACA TAITGGGGGG GIQGgTOGCT GOCCftGCTOG 5460 

QQaxm^ TGOOGCEACT GOCITTCTCG GOGCIGGCTT AGCIGGCGCC GCCATOGGCA 5520 

GICTIGGACT GOGGAAGGTC CTCATAGACA TCCITSCAGG GIATGGOGOG GGOGTCGCGG 5580

GAGCTCTTCT GGCATTCAAG ATCATCAGOG GTCAGGTCCC CTOCAOGGAG GACCIQGTCA 5640 

ATCEACTGOC OGCCATOCTC TOGCCOGGAG CCCTOCTACT OGGQCTGSPC TGTCCAGCAA 5700 

TMTOO300S GCAOGTIQGC OOGGGOGAGG GGGCACTGCA GIGGATCAAC OQGCTGAIAG 5760 

CCTTOGCCTC OOQGGGGAAC CAICTTrCCC CCAOGCACIA OGTCCOGGAG AGOSATCCAG 5820 

CTOCCOGOGT CACIGCCAIA CTCAGCAGCC TCACTGTAAC CCAGCTCCTG AGGCGACTCC 5880

ACCAGTGGAT AAGCTOQGAG TGTEAOCACTC CATCCIXXX3G TTCCTGGCTA AGGGACATCT 5940 

GQGACTGGAT AXGQGAGG7IG TTGAGCGACT TTAAGAOCIG GCEAAAAGCT AAGCICATCC 6000 

CACAGCIGCC TGGGATCCCC TTICTCTOCT GCCAGOGCGG GTATAAGGGG GTCTGGOGAG 6060

TOGAOGGCA TCATCCACAC TOGCTCCCAC TCTGGAGCIG AGATCACIGG AGATOTCAAAA 6120 

AOGGGAOSAT GAGGATOCTC GGTOCTAGGA CCTCCAGGAA CATCIGGACT GGGACCITCC 6180

OCA2TAA1GC CTACAOCACG GGCOCdCTA CCCCCCTTCC TGOGCOGAAC TACAOGOTOG 6240

OGCEATOGAG GGICTCTGCA GAGGAATATG TGGAGATAAG GGAGGIGGGG GACTTCCACT 6300

AOGTCACQGG TATGACEACE GACAATCTCA AATCCCOGIG CCAGGTCCCA TOGCCOGAAT 6360 

TITTCACAGA ATIGGAOGGG GIGCGCCEAC ATAGGITIGC GCOCCCCIGC AAGCOCTIGC 6420 

TGOQGGAGGA GGEATCATTC AGAGEAGGAC TCCAOGAAIA CCOGGEAGGG TOGCAATEAC 6480 

CTTOOGAGOC 0GAAC03GAC GIGGCCCTGT TGACGTCCAT GCTCACIGAT OCCTCCXMA 6540

TAACAGCAGA GGOG GC OGGG OGAAGGITGG CEAGGGCMC ACCCCCCTCT GIGGCCAGCE 6600 

CCTOQGCEAG CCAGCEATCC GCTCCATCIC TCAAGGCAAC TIGCACOGCT AACCATGACE 6660 

OCCCTGATCC TGAGCTCAXA GAGGCCAAOC TCCEAIGGAG GCAGGAGA1G GGCQGCAACA 6720 

TCAOCAGGGT TGACTCAGAA AACAAAGIGG TGAITCIGGA CiCaTUGAT OOGCITCTGG 6780 

CGGAGGAGGA OSAGOGGGAG ATCTOCGrEAC CCGCAGAAAT CCIGOGGAAG TCTOGGAGAT 6840

TOGOOCAGGC CCIGCOOGTE TQGGOGCGGC OGGACEAXAA CCOCCCGCEA GTCGAGAOGT 6900 

QGAAAAAGCC CEACEAOGAA CCACCIGIGG TOCATOGCIG TCOGCTTCCA CCTCCAAAGT 6960 

COOCIOCICT GOCTCOGCCT OGGAAGAAGC GGAOGGTCGT CCTCACIGAA TCAACCCEAT 7020

CTACIGCCIT GGOOGAGCTC GCCACCAGAA GCITTCGCAG CTCCICAACT TOOGGCATEA 7080 

OGGGOGACAA TAOGACAACA TOCTCT G AGC OOGCOCCITC TGGdGCOOC CCOSVCTCOG 7140 

AOGCTGAGTC CTATICCTCC ATCCCCCOCC TGGAGGGGGA GCCIGGGGAT CCGGATCTEA 7200 

GOGACGGGTC AIGGTCAAOG GECAGTAGTC AGGCCAAOGC GGAGGATOIC GTCIGCIGCE 7260 

CAATOTCTEA CTCTIGGAGA GGOGCACTOG TCAOCCOGIG OGCOGOGGAA GAACAGAAAC 7320

TGOOCATCAA IGCACEAAGC AACTCCTIGC TAOGTCACCA CAATTIGGTC TA3T0CAGCA 7380 

OCTCAOGCAG TGCTIGCCAA AGGCAGAAGA AAGTCACAIT TGACAGACIG CAACTICIGG 7440 

ACAGCCATEA CCAGGAOGTA CTCAAGGAGG TEAAAGCAGC GGOGTCAAAA GIGAAGGCEA 7500 

ACTIGCEATC CGEAGAGGAA GCTIGCAGCC TCAOGCCCCC ACACTCAGCC AAATCCAAGT 7560 

TIGGTEATCG GGCAAAAGAC GTCCGTIGCC ATGCCAGAAA GGCOGEAACC CACATCAACT 7620

COGIGTOGAA AGACCITCTG GAAGACAATG TAACACCAAT AGACACTAOC ATCAIGGCEA 7680 

AGAACGAGCT TTTCIGOGTE CAGQCTGAGA AGGGGGGTOG TAAGCCAGCT OGTCTCATOG 7740 

TGITCCCOGA TCTGGGCGIG OGOGIGIGOG AAAAGATCGC TTTGTAOGAC GIGGTEACAA 7800

AGCTCOCCIT GGOCGIGATC GGAAGCTCCT AOGGA1TCCA AXACTCAGCA GGACAGOGGG 7860

TIGAATICCT OTGCAAGaG TGGAAGTCCA AGAAAACCCC AATCGGOITC T0G7EATGAXA 7920 

OJU taC l UC I T TGACTCCACA GTCACTGAGA GGGACATCOG TAOGGAGGAG GCAATCTAOC 7980 

AATCITCIGA OCIOGACOOC CAAGCCCGCG TCGCCATCAA CTCCCTCACC GAGAGGCTTT 8040 

ATOITOGGGG OOCTCmOC AATTCAAGGG GGGAGAACIG OGGCIATOGC AGCTGCOGOG 8100 

OGAGCG G OCT ACIGACAACT AGCICTGCTA ACACXXTCAC TTCCIACATC AAGGGOOGGG 8160 

CAGCCIGTOG AGCOGCAGGG CTOCAGGACT GCACCATGCT OCTGICTGGC GAOGACTTAG 8220 

l UaT KIC lU TGAAAGCEOG GGGGTCCAGG AGGAOGGGGC GAGCCIGAGA GCCITCAOGG 8280 

AGGCIA1GAC CAGCTACTCC GCCCCCCCTG GGGACCCCCC ACAACCAGAA TAOGACITGG 8340 

AGCTCKIAAG ATCATCCTCC TCCAACGICT CACTOGCXXA GGAOGGOGCT GGAAAGAGGG 8400 

TCTWOaOCT CACCOGIGAC CCEACAAOCC CCCTO G OGftG AGCTGCCTGG GAGACAGCAA 8460 

GACACACTOC AGTCAATTOC TCGCIAGGCA ACATAATCAT GTITGCCOOC ACACTCTGGG 8520 

OSAGGATGAT ACIGATCACC O KITIITITA GCCTCCTCAT AGCCAGGGAC CAGCTTGAAC 8580 

AGGOCCTOGA TTGCGAGATC TAOGGGGOCT GCEACTCCAT AGAACCACIT GATCTACCTC 8640 

CAATCAXTCA AAGACTCCAT GGOCTCAGOG CATTITCACr CX^CAGITAC TCTCCAGGIG 8700 

AAAJTAAIAG GCT3GCOGCA TCCCTCAGAA AACITGGGGT ACCGCCCITG CGAGCTIGGA 8760 

GACAOOGGGC OOGGAGOGTC OGCGCIAGGC TTCIGGCCAG AGGAGGCAGG GCTGCCAIAT 8820 

CTQGCAAGIA CCTCITCAAC TCGGCACTAA GAACAAAGCT CAAAC 8865 

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 367 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

YW5CCICAAM ACVGGSTKKY TKGCCGSGCT KITCTAYMMM CACAAGITCA ACKCKTOTGG 60 

MICTCCKGAG MSSMTRGCCA GCTGY0C3WC CMITRMCAAK TTYGACCAGG QCGGGGYCC 120

YATCASYTAT GCYMAMSSWR RCHRCYCSGA CCAGMGSOCS TAYTGCTGGC ACTACSCMDC 180

WMRACMKIGY GGTAIYCTRC C0GCGWM34R KGTCTGYGCT COCTKTATT GCTTCACYCC 240

MAGCCCYGTK GTCIGGGRA OSfcCOGAYMS KTYSGGCGCS CCYACSTAYA RCIGGGGKGA 300 

MAAIGAKAOG GAOGT5YTSS TCCIWAACAA YACSMGGCCM COGCWSGGCA AYTCGITCGG 360 

YTGPTACA 367

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1249 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
WOQ3CTCCTC GGOGOOCCYY TWGGRGGOGC TGCCAGGGCC CTGGCRCATC GYCTCOGGCT 60 
TCTGGARGAC GGCCTGAACT ATGCAACAGG GAAYYTKCCY GCTIGCTCTr TCTCEATCIT 120

CCTYYTGGCT CTGCTSTCYT GYTTGACYRT SOCMGCTTCS GCYTAYSAAG TGCGCAACKY 180 
SWCSGGQOW TAOCAYGTCA CMAAYGAYTG CYCYAACTGR AGYATTGTICT AYGAGGCGGC 240 

SGAYGYSATC MIGCAYRCYC CSGGCTGOCT SCCYIGOGTT CGKGAGRRCA AYKOCTCSMG 300 
KTGYTQGGTA GCGMTSACYC CYAOGSTSGC SRCCAGGRAT GSCABMSTCC CCRCKA03M 360 

RYTWOGAGGY CACRTOGAYY TGCTYCTYGG GASSGCYRCY YTCIGYTCSG CYMTSTAOCT 420

GGGGGAYC3M TCOGGOTCTC TVTTYCTYRT CKSOCARCIG TTCACCTTCT CKCCYMSSOG 480 
SCAYKRGACR RYPCARGRYT GCAAYTGCIC WATCTATCCC GGGCAYREAW CR3GYCAYCG 540 

CATGGCWPGG GAXATGATGA TCAACTQCTC SCCYAOGRCR GCSTIPCTRR TGKCKCAGYT 600

RCTCOGGATC CCACAAGCYR TCWTOGACAT GRISGCKGGK GCYCACTGGG GACTCCITCC 660 
GGGCMIWGCS TAYTWYTOCA TGGTGGGGAA CTCGGCKAAG GTYYTGKTWG TGMIGCTRCT 720 

MTriGOaSGC CTYGAOGSGS AWACCCRCCT SACSGGGGGR RKKSTOGGCC ACRYYRYSTC 780
TRSAYTYRYK WSOCTCYTYR SACCWGGSGC SWMSCAGAAM RTYCAGCTKR TMAACACCAA 840
YGGCAGITGG CAYMICAAYA GSACW30CCT GAACIGCAAT GAYWSCCTCM AMACYGGSTK 900

SYTKGOGSG CTKCTCTAYM MMCACAAGrTT CAACKCOTCM GQGGYOCWS A3GSMTRGC 960 

CAGCIGYCGM YCCMnWER AKTTYGACCA GGOGGGGGY CCYATCASYT ATGCYMAMSS 1020 

WRRCKRCYCS GACCAQCSC CSTAYTGCIG GCACTACSCM CCWMRACMKT GYGG7EATYGT 1080 

FCC0G0GWM3 MRK7ICTGYG GTCCRGIKrA TTGCTTCACY CCMAGCCCYG TKCTCTGGG 1140 

RACGACCGAT MGKTYSGGOG CSOCYACSTA YARCIGGGGK GAMAAIGAKA OGGAOGISYT 1200 

SSTOCIWAAC AAYACSM3GC CMCOSCWSGG CAAYTGGITC GGYTCTACA 1249 

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 278 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

TGGGCAAYTG GrTTOGGYTCT ACMIGGATGA AYWSMACTGG RITCAGCAAR RYCTGCGGAG 60

SSCCYOCKIG TKWCATOGGR GGGGYSGGCA ACAACACCYT Q*E3GC00C ACK3AYTGCT 120 

TCOGSAAGWM YCOGPMSGCC ACYTACWCWM RRTGYGGYTC SGGYCCYTOG WISACAGCYA 180 

GGflGCTTGCT YGACTACOCR TAYAGGCTYT GGCAYTAYOC YTGYACYOTC AACIWYACCA 240 

TMTTYAARRT YAGGAOTEAY GIGGGRGGSG TSGARCAC 278 



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1539 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

ACACTGGGCT TTCGIGCIWA YATCTCCAAG GCWCATGGSA YOGAYCCYAA CATCAGRACY 60

GGGGTRAG*A CMATYACCAC WGGYPSCCCC ATYAOGTACT CCACCTAYSG CAAGITCCIT 120 

GCOGAOGGYG GKTOCTCSGG GGGOGCYTAT GACATMATAA TTTCTGAOGA GIGCCACTOC 180 


ACGGAIGCCA CATCCATCIT GGGCATCGGC ACTCTCCTTC ACCAAGCAGA GACIGCGGGG 240 
GCGAGACTCG TTCIGCTOGC CACOGCCACC CCTCOGGGCT COGTCACIGT GCCCCATOCC 300 

AACATOGAGG AGGITCCICT GTCCACCACC GGAGAGATCC CITnTACGG CAARRSYATC 360 

CCCMICGABG YMATCAAGGG GOGRAGRCAT CTCATCITCr GYCATTCMAA GAAGAAGTCY 420 

GftOGARCIOG COGCAAAGCT GKYMGOttTS GGMMICAATG COGTGGCSTA YEACOGOGGT 480 

CTTOAXGICT OOGICKIMCC RACYAGCGGM GAYGIYGTCYG TOGIGGCAAC MGAXGCCCTC 540 

MXaCOGGCT AXACOGGCGA CTTCGACTOG GTGATAGACT GCAATAOSTC TCTCACCCAG 600 

AOGICGATT TCAGCCITCA OOCDVCCriC ACCATTCAGA CAATCADGCT CCCCCAGGAT 660 

GCTCICTCCC GCACICAAOG TOQGGGCAGG ACTGGCAGGG GGAAGCCAGG CATCIACAGA 720 

TITCIGGCAC CGGGGGAGCG OCCCFOOGGC ATCITOGACT OTOXCTCCT CICTGACTGC 780 

TATCACGCAG GCTGTGCITG GIATCAGCTC A0GCC0GCOG AGACIACAGT TAGGCTAOGA 840 

GCGIACATCA ACACCOOGGG GCTTCCOGIG TGCCAGGACC ATCTTCAATT TIGGGAGGGC 900 
GICTTCACAG GCCTCACTCA TATAGATGCC CACTTTCTAT CCCAGACAAA GCAGAGTGGG 960 


GAGAAGCTTC CTTAOCTOCT AGOCTAjCCAA GCCACCGTCT GOGCTAGGGC TCAAGOCCCT 1020 

CCCCCATOGT GGGACCAGAX GIGGAAGICT TIGATTCGOC TCAAGCCCAC CCTCCATCGG 1080 

OCAACACCCC TCdKTACAG ACIGGGCGCT CTTCAGAATG AAATCACCCT GAOGCAOOCA 1140 

CTCAOCAAAT ACATCATCAC ATCCATCPOG GCCEACCIGG AGGTOCTCAC GAGCACCTGG 1200 

GTCCKCTIG GOGGOCTOCT GGCTCCITIG GCOGCGTATT GOCTOTCAAC AGGCIGCCTG 1260 


GTCAIAGIGG GCAGGGTOGT CTICTCOGGG AAGCCGGCAA TCATACCIGA CAGGGAACTC 1320 

CTCEACCGAG ACTTOGATCA GATCGAAGAG TGCKCYYMRC AOTMCCSTA CATOGARCAR 1380 

GGRAOTW3C TOGOOGAGCA RTTCAAGCAG AAGGCSCTCG GSYTSCTGCA RACMGCSWCC 1440 

MRKCAH3CR3 AGGYTKYYGC YCCKKSTGWS YMRAYSMACK SSYMRAAACT OGAGACCTTC 1500 

TGGGCGAAGC AIATGTGGAA CITCATCACT GGGATACAA 1539 






(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 341 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

GVT3GGCAAT TQCTTOGGIT GYAOCTGGAT GAACTCAWCT GGAITYACCA AAGICTGCGG 60 

AGOGOCTCCT TGICTCATOG GAGGGGYGGG CAACAACACC YTGCMGX3CC OCACTGAYTG 120 

TTTCCGCAAG CAXOOQGAOG CCACATACTC TCGCTGCGGY TCCGGTCCCT GGA3YACRCC 180 

CAGCTGCCTG GTCSACTAOC CKEAXftGGCT TTGGCATTAT CCVTGTACTR TCAACIACAC 240 

CWIKETYAAA FTCAGGATCT AOGTQGGAGG GGTCGARCAC AGGCIGGAAG YTGCCTGCAA 300 

CTGGAOGOGG QGOGAROGTT GYGATCIGGA MGAGAGGGAC A 341 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 293 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

ATGAGCACRA ATCCTAAACC TCAAARAAAA AMCAAACCTA ACACCAACOG YCGCCCACAG 60 

GACCTCAACT TCCOGGGYGG YGGTCAGATC GTTGGTGGAG TTTACVTGTT GCCGOGCAGG 120 

QGCOCYAGRT TGGGTICTGOS OGOGACKAGR AAGAdTCOG AGOGGTOGCA ACCTOGWGGW 180 

ACRCGWCARC CIATCOCCAA GGCTOGYCHG CCCGAGGGCA GGRCCIGGGC TCAGCCCGGG 240 

TACCCITGGC CXXTCTATGG CAAYGAGGGC WKSGGGTGGG CRGGATGGCT OCT 293 
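
The consensus sequences listed above contain degenerate positions written with letters such as R, Y, K, M, S and W. The following is a minimal sketch, assuming these letters follow the standard IUPAC nucleotide ambiguity code, of how a concrete sequence could be checked against such a consensus. The function name `matches_consensus` and the 12-nucleotide example strings are illustrative assumptions, not part of the listing.

```python
# Standard IUPAC nucleotide ambiguity codes (assumed to be the code used in the listing).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_consensus(seq, consensus):
    """Return True if every base of seq is allowed by the consensus at that position."""
    if len(seq) != len(consensus):
        return False
    return all(
        base in IUPAC[code]
        for base, code in zip(seq.upper(), consensus.upper())
    )


# Hypothetical usage: "R" permits either A or G at the ninth position.
print(matches_consensus("ATGAGCACAAAT", "ATGAGCACRAAT"))
print(matches_consensus("ATGAGCACGAAT", "ATGAGCACRAAT"))
print(matches_consensus("ATGAGCACCAAT", "ATGAGCACRAAT"))
```

A lookup table keyed by ambiguity letter keeps the check a single pass over the two strings; any character outside the table raises a `KeyError`, which is useful for spotting transcription errors in a listing.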



Claims 

1. A polynucleotide in substantially isolated form comprising a nucleotide sequence of at least 15 nucleotides from a 
J-7 HCV isolate, said J-7 HCV isolate having at least 90% nucleotide sequence homology with the J-7 sequence 
of Figure 1 or 6, wherein said nucleotide sequence of at least 15 nucleotides is distinct from the nucleotide 
sequence of HCV isolate HCV-1 as shown in Figure 12. 

2. A polynucleotide according to claim 1 wherein the J-7 HCV isolate has at least 95% homology with the J-7 HCV 
sequence of Figure 1 or 6. 

3. A polynucleotide according to claim 1 wherein the J-7 HCV isolate has 100% homology with the J-7 HCV sequence 
of Figure 1 or 6. 

4. A polynucleotide according to any one of the preceding claims which comprises at least 20 nucleotides. 

5. A method of detecting HCV polynucleotides in a test sample comprising: 

(a) providing a polynucleotide as defined in any one of claims 1 to 4 as a probe; 

(b) contacting the test sample and the probe under conditions that allow for the formation of a polynucleotide 
duplex between the probe and its complement in the absence of substantial polynucleotide duplex formation 
between the probe and non-HCV polynucleotide sequences present in test sample; and 
(c) detecting any polynucleotide duplexes comprising the probe. 

6. A polynucleotide comprising a sequence of at least 15 nucleotides from a J-7 HCV isolate present in any one of 
plasmids pSl-8791a, bU1-1216c, bU1-4652d, pSl-713c, pS7-28c, pSM519, TC-600BR, JH-400BP, AW-300BP, 
AW-770-BP-N, AW-700BP-C, AW-700BP-N or J1 5-1-1 deposited under accession numbers BP-2593, BP-2594, 
BP-2595, BP-2637, BP-2638, BP-3081, 68333, 68394, 68392, 68395 and 40884 respectively, wherein said nucleotide 
sequence of at least 15 nucleotides is distinct from the nucleotide sequence of HCV isolate HCV-1. 

7. A purified polypeptide comprising an amino acid sequence which: 

(a) is encoded by a nucleotide sequence as defined in any one of claims 1 to 4 or in the HCV sequences 
deposited and defined in claim 6, said coding being in frame with the corresponding amino acid sequences set 
out in Figures 1 and 6; 

(b) comprises an antigenic determinant; and 

(c) is distinct from the sequence of the polypeptides encoded by the HCV isolate HCV-1. 

8. A polypeptide according to claim 7 which comprises at least 10 amino acids. 

9. A polypeptide according to claim 7 which comprises at least 15 amino acids. 

10. A polypeptide according to any one of claims 7 to 9 immobilised on a solid support. 

11. An immunoassay for detecting the presence of anti-HCV antibodies in a test sample which comprises: 

(a) incubating the test sample under conditions that allow the formation of an antigen-antibody complex to be 
formed with a polypeptide as defined in any one of claims 7 to 10, wherein the polypeptide is not immunologically 
cross-reactive with HCV-1; and 

(b) detecting any antigen-antibody complexes formed. 

12. An immunoassay according to claim 11 wherein the test sample comprises human blood or a fraction thereof. 
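
The claims above turn on two sequence comparisons: a percentage of nucleotide homology between isolates, and the existence of a subsequence of at least 15 nucleotides that is distinct from HCV-1. The text does not specify an alignment method, so the following is only a minimal sketch under the assumption that the two sequences are already aligned position by position without gaps. The function names and the 13-codon excerpts (taken from the start of the coding region in Figures 6 and 12) are illustrative and far too short to represent a whole-isolate comparison.

```python
def percent_identity(a, b):
    """Percentage of positions at which two pre-aligned, equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and pre-aligned to the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def distinct_kmers(query, reference, k=15):
    """Return every k-nucleotide window of query that does not occur anywhere in reference."""
    query, reference = query.upper(), reference.upper()
    return [
        query[i:i + k]
        for i in range(len(query) - k + 1)
        if query[i:i + k] not in reference
    ]


# Illustrative excerpts only: the first 13 codons of the coding region.
J7_START = "".join("ATG AGC ACA AAT CCT AAA CCC CAA AGA AAA ACC AAA CGT".split())
HCV1_START = "".join("ATG AGC ACG AAT CCT AAA CCT CAA AAA AAA AAC AAA CGT".split())

print(round(percent_identity(J7_START, HCV1_START), 1))
print(len(distinct_kmers(J7_START, HCV1_START, k=15)))
```

A gapped alignment would be needed before applying `percent_identity` to sequences of different length; that step is deliberately left out of this sketch.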
J7 1 AGCCGAGTAGTGTTGGGTCGCGAAACGCCTTGTGGT 

discrepancy 
clone 

altered aa 



J7 37 ACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGG 



Met Ser Thr Asn 

J7 73 TCTCGTAGACCGTGCATC ATG AGC ACA AAT 



Pro Lys Pro Gln Arg Lys Thr Lys Arg 
J7 103 CCT AAA CCC CAA AGA AAA ACC AAA CGT 

T 6 
b 1 

Arg 



Asn Thr Asn Arg Arg Pro Gln Asp Val 

J7 130 AAC ACC AAC CGT CGC CCA CAG GAC GTT 

C 
b 



Lys Phe Pro Gly 

J7 157 AAG TTC CCG GGC 

T 
1 

Leu 



Gly Gly Gln Ile Val 

GGT GGT CAG ATC GTC 

T 
b 



Gly Gly Val Tyr Leu Leu Pro Arg Arg 

J7 184 GGT GGA GTT TAC TTG TTG CCG CGC AGG 

A 
b 



Gly Pro Arg Leu Gly Val Arg Ala Thr 
J7 211 GGC CCC AGG TTG GGT GTG CGT GCG ACT 

FIG. 1-1 
J7 



Arg Lys Thr Ser Glu Arg Ser Gln Pro 
238 AGG AAG ACT TCC GAG CGG TCG CAA CCT 

A 
b 



J7 



Arg Gly Arg Arg Gln Pro Ile Pro Lys 

265 CGT GGA AGG CGA CAA CCT ATC CCC AAG 



J7 



292 



Ala Arg Arg Pro Glu Gly Arg Thr Trp 

GCT CGC CGG CCC GAG GGC AGG ACC TGG 



J7 



Ala Gln Pro Gly Tyr Pro Trp Pro Leu 

319 GCT CAG CCT GGG TAT CCT TGG CCC CTC 



J7 



Tyr Gly Asn Glu Gly Leu Gly Trp Ala 

346 TAT GGC AAT GAG GGC TTG GGG TGG GCA 

A 

b 

END 



J7 



373 



Gly Trp Leu Leu Ser Pro 

GGA TGG CTC CTG TCA CCC 



Arg Gly Ser 

CGC GGC TCT 



J7 



Arg Pro Ser Trp Gly Pro Asn Asp Pro 

400 CGG CCT AGT TGG GGC CCC AAT GAC CCC 

T C 
C b 

Thr 



J7 



Arg Arg Arg Ser Arg Asn Leu Gly Lys 

427 CGG CGT AGG TCG CGT AAT TTG GGT AAG 



J7 



Val Ile Asp Thr Leu Thr Cys Gly Phe 

454 GTC ATC GAT ACC CTT ACA TGC GGC TTC 

C 
1 

Leu 



FIG. 1-2 
J7 



Ala Asp Leu Met Gly Tyr Ile Pro Leu 

481 GCC GAC CTC ATG GGG TAC ATT CCG CTT 

C C 
c b 



J7 



508 



Val Gly Ala Pro Leu Gly Gly Ala Ala 

GTC GGC GCC CCC TTA GGG GGC GCT GCC 



J7 



Arg Ala Leu Ala His Gly 

AGG GCC CTG GCA CAT GGT 

FIG. 1-3 
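
Figure 1 pairs each row of codons with a row of three-letter amino acid names. As a way of checking such rows against each other, here is a minimal sketch assuming the standard genetic code; the function name `translate` and the table layout are illustrative choices, not something the figure defines.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter names, as used in the figure rows.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) codon by codon into three-letter names."""
    dna = "".join(dna.upper().split())
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)]


# Hypothetical usage with the first coding codons shown in FIG. 1-1.
print(" ".join(translate("ATG AGC ACA AAT")))
print(" ".join(translate("CCT AAA CCC CAA AGA AAA ACC AAA CGT")))
```

Running a codon row through `translate` and comparing the result with the printed amino acid row is a quick way to spot single-character transcription errors in either row.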
Pro Leu Val Gly Ala Pro Leu Gly Gly 

J1 1 T CCG CTC GTC GGC GCC CCC TTA GGG GGC 

discrepancy C 
clone d 
altered aa Ser 



Ala Ala Arg Ala Leu Ala His Gly Val 

29 GCT GCC AGG GCC CTG GCA CAT GGT GTC 



Arg Val Leu Glu Asp Gly Val Asn Tyr 

56 CGG GTT CTG GAG GAC GGC GTG AAC TAT 



Ala Thr Gly Asn Leu Pro Gly Cys Ser 

83 GCA ACA GGG AAT TTG CCC GGT TGC TCT 



Phe Ser Ile Phe Leu Leu Ala Leu Leu 
110 TTC TCT ATC TTC CTC TTG GCT CTG CTG 

A T 

g d 



Ser Cys Leu Thr Ile Pro Ala Ser Ala 

137 TCC TGT TTG ACC ATC CCA GCT TCC GCT 



Tyr Glu Val Arg Asn Val Ser Gly Ile 

164 TAT GAA GTG CGC AAC GTG TCC GGG ATA 



Tyr His Val Thr Asn Asp Cys Ser Asn 
191 TAC CAT GTC ACA AAC GAC TGC TCC AAC 

T 
d 



Ser Ser Ile Val Tyr Glu Ala Ala Asp 
218 TCA AGC ATT GTG TAT GAG GCG GCG GAC 

FIG. 2-1 
245 



Val Ile Met His Ala Pro Gly Cys Val 

GTG ATC ATG CAT GCC CCC GGG TGC GTG 



272 



Pro Cys Val Arg Glu Asn Asn Ser Ser 

CCC TGC GTT CGG GAG AAC AAT TCC TCC 

C 

d 



299 



Arg Cys Trp Val Ala Leu Thr Pro Thr 

CGT TGC TGG GTA GCG CTC ACT CCC ACG 



326 



Leu Ala Ala Arg Asn Ala Ser Val Pro 

CTC GCG GCC AGG AAT GCC AGC GTC CCC 



Thr Thr Thr Leu 

ACT ACG ACA TTA 

G 
d 



Arg His Val Asp 
CGC CAC GTC GAC 



380 



Leu Leu Val Gly Thr Ala Ala Phe Cys 

TTG CTC GTT GGG ACG GCT GCT TTC TGC 



407 



Ser Ala Met Tyr Val Gly Asp Leu Cys 

TCC GCT ATG TAC GTG GGG GAT CTC TGC 



434 



Gly Ser Val Phe Leu Ile Ser Gln Leu 

GGA TCT GTT TTC CTC ATC TCC CAG CTG 

T 
d 



461 



Phe Thr Phe Ser Pro Arg 

TTC ACC TTC TCG CCT CGC 

FIG. 2-2 



His Glu 
CAT GAG 
Thr Val Gln Asp Cys Asn Cys Ser Ile 

488 ACA GTA CAG GAC TGC AAC TGC TCA ATC 



Tyr Pro Gly His Val Ser Gly His Arg 

515 TAT CCC GGC CAC GTA TCA GGC CAT CGC 

T 
c 



Met Ala Trp Asp Met Met Met Asn Trp 

542 ATG GCT TGG GAT ATG ATG ATG AAC TGG 



Ser Pro Thr Ala 

569 TCG CCC ACG GCA 

FIG. 2-3 
Asn Trp Ser Pro Thr 

J1 1 AAC TGG TCG CCC ACG 

discrepancy 
clone 
altered aa 



Ala Ala Leu Val Val Ser Gin Leu Leu 

J1 16 GCA GCC TTA GTG GTG TCG CAG TTA CTC 

111 



Arg Ile Pro Gln Ala Val Met Asp Met 

J1 43 CGG ATC CCA CAA GCT GTC ATG GAC ATG 



Val Ala Gly Ala His Trp Gly Val Leu 

Jl 70 GTG GCG GGG GCC CAC TGG GGA GTC CTA 

G 

■ 

1 



Ala Gly Leu Ala Tyr Tyr Ser Met Val 

Jl 97 GCG GGC CTT GCC TAC TAT TCC ATG GTG 

A 

i 



Gly Asn Trp Ala Lys Val Leu Ile Val 

Jl 124 GGG AAC TGG GCT AAG GTT TTG ATT GTG 



Met Leu Leu Phe Ala Gly Val Asp Gly 

Jl 151 ATG CTA CTC TTT GCC GGC GTT GAC GGG 



His Thr Arg Val Thr Gly Gly Val Gln 

Jl 178 CAT ACC CGC GTG ACG GGG GGG GTG CAA 

AG A 

gg i 



FIG. 3-1 
Gly His Val Thr Ser Thr Leu Thr Ser 

J1 205 GGC CAC GTC ACC TCT ACA CTC ACG TCC 

T 6 
C i 

Ala 



Jl 232 



Leu Phe Arg Pro Gly Ala Ser Gln Lys 

CTC TTT AGA CCT GGG GCG TCC CAG AAA 



Jl 



Ile Gln Leu Val Asn Thr Asn Gly Ser 

ATT CAG CTT GTA AAC ACC AAT GGC AGT 

TC T 
ii i 

Ser Leu 



Trp His Ile Asn Arg Thr Ala Leu Asn 

Jl 286 TGG CAT ATC AAC AGG ACT GCC CTG AAC 

T 

g 



Jl 313 



Cys Asn Asp Ser Leu Gln Thr Gly Phe 

TGC AAT GAC TCC CTC CAA ACT GGG TTC 



Jl 340 



Leu Ala Ala Leu 

CTT GCC GCG CTG 

FIG. 3-2 
J1 1 C TCA 

discrepancy 
clone 

altered aa 



Val Ile Asp Cys Asn Thr Cys Val Thr 

J1 5 GTG ATC GAC TGT AAC ACA TGT GTC ACT 



Gln Thr Val Asp Phe Ser Leu Asp Pro 

Jl 32 CAG ACG GTC GAT TTC AGC TTG GAT CCC 



Thr Phe Thr Ile Glu Thr Thr Thr Val 

Jl 59 ACC TTC ACC ATC GAG ACG ACG ACC GTG 

G 

c 

Ala 



Pro Gln Asp Ala Val Ser Arg Thr Gln 

Jl 86 CCC CAA GAT GCG GTT TCG CGC ACG CAG 



Arg Arg Gly Arg Thr Gly Arg Gly Arg 

Jl 113 CGG CGA GGT AGG ACT GGC AGG GGC AGG 

* 

Arg Gly Ile Tyr Arg Phe Val Thr Pro 
Jl 140 AGA GGC ATC TAT AGG TTT GTG ACT CCA 



Gly Glu Arg Pro Ser Ala Met Phe Asp 

Jl 167 GGA GAA CGG CCC TCG GCG ATG TTC GAT 



Ser Ser Val Leu Cys Glu Cys Tyr Asp 

Jl 194 TCT TCG GTC CTA TGT GAG TGT TAT GAC 



Ala Gly Cys Ala Trp Tyr Glu Leu Thr 

J1 221 GCG GGC TGT GCT TGG TAT GAG CTC ACG 

A 

e 

Gly (-) 



FIG. 4-1 
J1 



Pro Ala Glu Thr Ser Val Arg Leu Arg 

248 CCC GCT GAG ACC TCG GTT AGG TTG CGG 



Jl 



Ala Tyr Leu Asn Thr Pro Gly Leu Pro 
275 GCT TAC CTA AAT ACA CCA GGG TTG CCC 



Jl 



302 



Val Cys Gln Asp His Leu Glu Phe Trp 

GTC TGC CAG GAC CAT CTG GAG TTC TGG 



Jl 



Glu Ser Val Phe Thr Gly Leu Thr His 

GAG AGC GTC TTC ACA GGC CTC ACC CAC 



Jl 



356 



Ile Asp Ala His Phe Leu Ser Gln Thr 

ATA GAC GCC CAC TTC TTG TCC CAG ACT 



Jl 



383 



Lys Gln Ala Gly Asp Asn Phe Pro Tyr 

AAG CAG GCA GGA GAC AAC TTC CCC TAC 



Jl 



410 



Leu Val Ala Tyr Gln Ala Thr Val Cys 

CTG GTA GCA TAC CAA GCC ACA GTG TGC 



Jl 



Ala Arg Ala Lys Ala Pro Pro Pro Ser 

437 GCC AGG GCT AAG GCT CCA CCT CCA TCG 

C 



Ala(-) 



Jl 



464 



Trp Asp Gln Met Trp Lys Cys Leu Ile 

TGG GAT CAA ATG TGG AAG TGT CTC ATA 



Jl 



Arg Leu Lys Pro Thr Leu His Gly Pro 

491 CGG CTA AAG CCT ACG CTG CAC GGG CCA 

G 

e 
Ala 



FIG. 4-2 
J1 



Thr Pro Leu Leu Tyr Arg Leu Gly Ala 
518 ACG CCC CTG CTG TAT AGG CTA GGA GCC 

A 



Arg(-) 



Jl 



545 



Val Gln Asn Glu Val Thr Leu Thr His 

GTC CAG AAT GAG GTC ACC CTC ACA CAC 



Jl 



572 



Pro Ile Thr Lys 

CCT ATA ACC AAA 



FIG. 4-3 






EP0 939128 A2 



Leu Thr 

Jl 1 C CTC ACC 

discrepancy 

clone 

altered aa 



Arg Asp Pro Thr Val Pro Leu Ala Arg 

J1 8 CGT GAC CCC ACC GTC CCC CTT GCG CGG 



Ala Ala Trp Glu Thr Ala Arg His Thr 

Jl 35 GCT GCG TGG GAG ACA GCT AGA CAC ACT 

C 

g 

Thr(-) 



Pro Val Asn Ser Trp Leu Gly Asn Ile 

Jl 62 CCA GTC AAC TCC TGG CTA GGC AAC ATC 



Ile Met Tyr Ala Pro Thr Leu Trp Ala 

Jl 89 ATC ATG TAT GCG CCC ACT TTG TGG GCA 

T 

g 

Ile(-) 



Arg Met Ile Leu Met Thr His Phe Phe 

Jl 116 AGG ATG ATT CTG ATG ACT CAC TTC TTC 



Ser Ile Leu Leu Ala Gln Glu Gln Leu 

J1 143 TCC ATC CTT CTA GCC CAG GAG CAA CTT 



Glu Lys Ala Leu Asp Cys Gln Ile Tyr 

Jl 170 GAA AAA GCC CTG GAT TGT CAA ATC TAC 



Gly Ala Cys Tyr Ser Ile Glu Pro Leu 

Jl 197 GGG GCC TGT TAC TCC ATT GAG CCA CTT 



FIG. 5-1 
Asp Leu Pro Gln Ile Ile Glu Arg Leu 

J1 224 GAC CTA CCT CAG ATC ATT GAA CGA CTC 



His Gly Leu Ser Ala Phe Ser Leu His 
Jl 251 CAT GGT CTT AGC GCA TTT TCA CTC CAT 



Ser Tyr Ser Pro Gly Glu Ile Asn Arg 
Jl 278 AGT TAC TCT CCA GGT GAG ATC AAT AGG 



Val Ala Ser Cys Leu Arg Lys Leu Gly 

Jl 305 GTG GCT TCA TGC CTC AGG AAG CTT GGG 



Val Pro Pro Leu Arg Val Trp Arg His 

Jl 332 GTA CCA CCC TTG CGA GTC TGG AGA CAT 



Arg Ala Arg Ser Val Arg Ala Lys Leu 

Jl 359 CGG GCC AGA AGT GTC CGC GCT AAG CTA 



Leu Ser Gln Gly Gly Arg Ala Ala Thr 

Jl 386 CTG TCC CAA GGG GGG AGG GCC GCC ACT 

G 

g 

Gln(o) 



Lys Gly Lys Tyr Leu 

Jl 413 TGT GGC AAG TAC CTC 

FIG. 5-2 
J7 
HCV1 



AGCCGAGTAGTGTTGGGTCGCGAAAGGCCTTGTGGT 



J7 
HCV1 



37 ACTGCCTGATAGGGTGGTTGCGAGTGCCCCGGGAGG 



J7 
HCV1 



73 



Met Ser Thr Asn 

TCTCGTAGACCGTGCATC ATG AGC ACA AAT 

C G 



J7 
HCV1 



Pro Lys Pro Gln Arg Lys Thr Lys Arg 

103 CCT AAA CCC CAA AGA AAA ACC AAA CGT 

T A A 

Lys Asn 



*** 



J7 
HCV1 



130 



Asn Thr Asn Arg 

AAC ACC AAC CGT 



Pro Gln Asp Val 

CCA CAG GAC GTT 

C 



J7 
HCV1 



Lys Phe 

157 AAG TTC 



Gly Gly Gly Gln Ile Val 
GGC GGT GGT CAG ATC GTC 
T C T 



J7 
HCV1 



Gly Gly Val Tyr Leu Leu Pro Arg Arg 
184 GGT GGA GTT TAC TTG TTG CCG CGC AGG 



J7 
HCV1 



Gly Pro Arg Leu Gly Val Arg Ala Thr 

211 GGC CCC AGG TTG GGT GTG CGT GCG ACT 

T A C G 



J7 
HCV1 



Arg Lys Thr Ser Glu Arg Ser Gln Pro 

238 AGG AAG ACT TCC GAG CGG TCG CAA CCT 
A 



FIG. 6-1 
Arg Gly Arg Arg Gln Pro Ile Pro Lys 

J7 265 CGT GGA AGG CGA CAA CCT ATC CCC AAG 
HCV1 A T A T G 



Ala Arg Arg Pro Glu Gly Arg Thr Trp 

J7 292 GCT CGC CGG CCC GAG GGC AGG ACC TGG 
HCV1 T 



Ala Gln Pro Gly Tyr Pro Trp Pro Leu 

J7 319 GCT CAG CCT GGG TAT CCT TGG CCC CTC 
HCV1 C C 



Tyr Gly Asn Glu Gly Leu Gly Trp Ala 

J7 346 TAT GGC AAT GAG GGC TTG GGG TGG GCA 
HCV1 GC G 

Cys 



Gly Trp Leu Leu Ser Pro Arg Gly Ser 

J7 373 GGA TGG CTC CTG TCA CCC CGC GGC TCT 
HCV1 T T 



Arg Pro Ser Trp Gly Pro Asn Asp Pro 

J7 400 CGG CCT AGT TGG GGC CCC AAT GAC CCC 
HCV1 C CA 

Thr 



Arg Arg Arg Ser Arg Asn Leu Gly Lys 

J7 427 CGG CGT AGG TCG CGT AAT TTG GGT AAG 
HCV1 C 



Val Ile Asp Thr Leu Thr Cys Gly Phe 

J7 454 GTC ATC GAT ACC CTT ACA TGC GGC TTC 
HCV1 G 



FIG. 6-2 
HCV1 



Ala Asp Leu Met Gly Tyr Ile Pro Leu 
J7 481 GCC GAC CTC ATG GGG TAC ATT CCG CTT 

A C 



Val Gly Ala Pro Leu Gly Gly Ala Ala 

J7 508 GTC GGC GCC CCC TTA GGG GGC GCT GCC 
HCV1 T C T A 

Arg Ala Leu Ala His Gly 

J7 535 AGG GCC CTG GCA CAT GGT 
HCV1 G C 

FIG. 6-3 
J1 



Pro Leu Val Gly Ala Pro Leu Gly Gly 

1 T CCG CTC GTC GGC GCC CCC TTA GGG GGC 
A T C T A 



Jl 



Ala Ala Arg Ala Leu Ala His Gly Val 

29 GCT GCC AGG GCC CTG GCA CAT GGT GTC 

G C 



Jl 



Arg Val Leu Glu Asp Gly Val Asn Tyr 

56 CGG GTT CTG GAG GAC GGC GTG AAC TAT 

A 



Jl 



Ala Thr Gly Asn Leu Pro Gly Cys Ser 

83 GCA ACA GGG AAT TTG CCC GGT TGC TCT 

C C T T 



Jl 



Phe Ser Ile Phe Leu Leu Ala Leu Leu 

110 TTC TCT ATC TTC CTC TTG GCT CTG CTG 

T C C C 



Jl 



Ser Cys Leu Thr Ile Pro Ala Ser Ala 

137 TCC TGT TTG ACC ATC CCA GCT TCC GCT 
TC TGGC GC 

Val 



Jl 



Tyr Glu Val Arg Asn Val Ser Gly Ile 

164 TAT GAA GTG CGC AAC GTG TCC GGG ATA 
C C TCC AG C T 

Gln Ser Thr Leu 



*** 



Jl 



Tyr His Val Thr Asn Asp Cys Ser Asn 

191 TAC CAT GTC ACA AAC GAC TGC TCC AAC 

C C T T C T 
Ser Ser Ile Val Tyr Glu Ala Ala Asp 

218 TCA AGC ATT GTG TAT GAG GCG GCG GAC 
G T C C T 



Val Ile Met His Ala Pro Gly Cys Val 

245 GTG ATC ATG CAT GCC CCC GGG TGC GTG 

CC C C A T G C 

Ala Leu Thr 



Pro Cys Val Arg Glu Asn Asn Ser Ser 

272 CCC TGC GTT CGG GAG AAC AAT TCC TCC 

T T GG C G G 

Gly Ala 
*** 



Arg Cys Trp Val Ala Leu Thr Pro Thr 

299 CGT TGC TGG GTA GCG CTC ACT CCC ACG 
AG T G AG C T 

Met 



Leu Ala Ala Arg Asn Ala Ser Val 

326 CTC GCG GCC AGG AAT GCC AGC GTC 
GGCA G GAAC 

Val Thr Asp Gly Lys Leu 



*** 



Thr Thr Thr Leu Arg Arg His Val Asp 

353 ACT ACG ACA TTA CGA CGC CAC GTC GAC 
G G CAG C T T A T 

Ala Gln Ile 



Leu Leu Val Gly Thr Ala Ala Phe Cys 

380 TTG CTC GTT GGG ACG GCT GCT TTC TGC 
C TC GCCACC T 

Ser Thr Leu 
Ser Ala Met Tyr Val Gly Asp Leu Cys 

J1 407 TCC GCT ATG TAC GTG GGG GAT CTC TGC 

G C C C C A 

Leu 



Gly Ser Val Phe Leu Ile Ser Gln Leu 

J1 434 GGA TCT GTT TTC CTC ATC TCC CAG CTG 

G C T T G GG A 

Val Gly 



Phe Thr Phe Ser Pro Arg Arg His Glu 

J1 461 TTC ACC TTC TCG CCT CGC CGG CAT GAG 

T C A G C C TG 

Trp 



Thr Val Gln Asp Cys Asn Cys Ser Ile 

Jl 488 ACA GTA CAG GAC TGC AAC TGC TCA ATC 

G ACG A GT T T 

Thr Gly 
*** *** 



Tyr Pro Gly His Val Ser Gly His Arg 

Jl 515 TAT CCC GGC CAC GTA TCA GGC CAT CGC 

T A AG T C 
Ile Thr 



Met Ala Trp Asp Met Met Met Asn Trp 
Jl 542 ATG GCT TGG GAT ATG ATG ATG AAC TGG 

A 



Ser Pro Thr Ala 

Jl 569 TCG CCC ACG GCA 

C T AG 

Thr 
Asn Trp Ser Pro Thr Ala 

J1 1 AAC TGG TCG CCC ACG GCA 
HCV1 C T AG 

Thr 



Ala Leu Val Val Ser Gln Leu Leu Arg 

Jl 19 GCC TTA GTG GTG TCG CAG TTA CTC CGG 
HCV1 G G AA GT CG 

Met Ala 



Ile Pro Gln Ala Val Met Asp Met Val 
Jl 46 ATC CCA CAA GCT GTC ATG GAC ATG GTG 
HCV1 CAT AC 

Ile Leu Ile 



Ala Gly Ala His Trp Gly Val Leu Ala 

Jl 73 GCG GGG GCC CAC TGG GGA GTC CTA GCG 
HCV1 T T T G 



Gly Leu Ala Tyr Tyr Ser Met Val Gly 

Jl 100 GGC CTT GCC TAC TAT TCC ATG GTG GGG 
HCV1 A A G T TC 

Ile Phe 



Asn Trp Ala Lys Val Leu Ile Val Met 

Jl 127 AAC TGG GCT AAG GTT TTG ATT GTG ATG 
HCV1 G C C G A C 

Val Leu 



Leu Leu Phe Ala Gly Val Asp Gly His 

Jl 154 CTA CTC TTT GCC GGC GTT GAC GGG CAT 
HCV1 G A C C G A 

Ala Glu 



Thr Arg Val Thr Gly Gly Val Gln Gly 

Jl 181 ACC CGC GTG ACG GGG GGG GTG CAA GGC 

HCV1 ACC A AGT GCC 

His Ser Ala 

*** *** 
His 

J1 208 CAC 
HCV1 



Val Thr Ser Thr 

GTC ACC TCT ACA 
ACT GTG GG 

Thr Val Gly 

*** *** 



Leu Thr Ser Leu 
CTC ACG TCC CTC 
T T GTT AG 

Phe Val 
*** 



Phe Arg Pro 

Jl 235 TTT AGA CCT 

HCV1 C C GC A 

Leu Ala 
*** 



Gly Ala Ser Gln Lys Ile 

GGG GCG TCC CAG AAA ATT 
C C AAG C G C 

Lys Asn Val 

*** *** 



Gln Leu Val Asn Thr Asn Gly Ser Trp 

Jl 262 CAG CTT GTA AAC ACC AAT GGC AGT TGG 
HCV1 G A C C 

Ile 



His Ile Asn Arg Thr Ala Leu Asn Cys 

Jl 289 CAT ATC AAC AGG ACT GCC CTG AAC TGC 
HCV1 C C T C G 

Leu Ser 



Asn Asp Ser Leu Gln Thr Gly Phe Leu 

Jl 316 AAT GAC TCC CTC CAA ACT GGG TTC CTT 
HCV1 TAG AC C C GG T G 



Asn Trp 



Ala Ala Leu 

Jl 343 GCC GCG CTG 

HCV1 AG T 

Gly 
Ser Val Ile 

J1 1 C TCA GTG ATC 

HCV1 ggctataccggcgacttcga G A 



Asp Cys Asn Thr Cys Val Thr Gln Thr 

Jl 11 GAC TGT AAC ACA TGT GTC ACT CAG ACG 

HCV1 C T G C A 



Val Asp Phe Ser Leu Asp Pro Thr Phe 

Jl 38 GTC GAT TTC AGC TTG GAT CCC ACC TTC 

HCV1 C T C T 



Thr Ile Glu Thr Thr Thr Val Pro Gln 

Jl 65 ACC ATC GAG ACG ACG ACC GTG CCC CAA 

HCV1 T A TC G C C G 



Asp Ala Val Ser Arg Thr Gln Arg Arg 

Jl 92 GAT GCG GTT TCG CGC ACG CAG CGG CGA 

HCV1 T C C T A T G 



Gly Arg Thr Gly Arg Gly Arg Arg Gly 
Jl 119 GGT AGG ACT GGC AGG GGC AGG AGA GGC 

HCV1 C G A CC 

Lys Pro 



Ile Tyr Arg Phe Val Thr Pro Gly Glu 
Jl 146 ATC TAT AGG TTT GTG ACT CCA GGA GAA 

HCV1 C A G A G G G 

Ala 



Arg Pro Ser Ala Met Phe Asp Ser Ser 

Jl 173 CGG CCC TCG GCG ATG TTC GAT TCT TCG 

HCV1 C C GC CGC 

Gly 
Val Leu Cys Glu Cys Tyr Asp Ala Gly 

Jl 200 GTC CTA TGT GAG TGT TAT GAC GCG GGC 

HCV1 CCA 



Cys Ala Trp Tyr Glu Leu Thr Pro Ala 

Jl 227 TGT GCT TGG TAT GAG CTC ACG CCC GCT 

HCV1 c 



Glu Thr Ser Val Arg Leu Arg Ala Tyr 

Jl 254 GAG ACC TCG GTT AGG TTG CGG GCT TAC 

HCV1 T A A C A A G 

Thr 



Leu Asn Thr Pro Gly Leu Pro Val Cys 

Jl 281 CTA AAT ACA CCA GGG TTG CCC GTC TGC 

HCV1 AGCCG CT G 

Met 



Gln Asp His Leu Glu Phe Trp Glu Ser 

Jl 308 CAG GAC CAT CTG GAG TTC TGG GAG AGC 

HCV1 TAT G 

Gly 



Val Phe Thr Gly Leu Thr His Ile Asp 

Jl 335 GTC TTC ACA GGC CTC ACC CAC ATA GAC 

HCV1 T T T T 



Ala His Phe Leu Ser Gln Thr Lys Gln 

Jl 362 GCC CAC TTC TTG TCC CAG ACT AAG CAG 

HCV1 T C A A 



Ala Gly Asp Asn Phe Pro Tyr Leu Val 

Jl 389 GCA GGA GAC AAC TTC CCC TAC CTG GTA 

HCV1 AGT G G C T T 

Ser Glu Leu 
Ala Tyr Gln Ala Thr Val Cys Ala Arg 

Jl 416 GCA TAC CAA GCC ACA GTG TGC GCC AGG 

HCV1 G C T 



Ala Lys Ala Pro Pro Pro Ser Trp Asp 

J1 443 GCT AAG GCT CCA CCT CCA TCG TGG GAT 

HCV1 C A C T C C 

Gln 



Gln Met Trp Lys Cys Leu Ile Arg Leu 

Jl 470 CAA ATG TGG AAG TGT CTC ATA CGG CTA 

HCV1 G T G T C C 



Lys Pro Thr Leu His Gly Pro Thr Pro 

Jl 497 AAG CCT ACG CTG CAC GGG CCA ACG CCC 

HCV1 C C C T A 



Leu Leu Tyr Arg Leu Gly Ala Val Gln 

Jl 524 CTG CTG TAT AGG CTA GGA GCC GTC CAG 

HCV1 A C A G C T T 



Asn Glu Val Thr Leu Thr His Pro Ile 

Jl 551 AAT GAG GTC ACC CTC ACA CAC CCT ATA 

HCV1 A A G G A G C 

Ile Val 



Thr Lys 

Jl 578 ACC AAA 

HCV1 tacatcatgacatgcatgtc 
Leu Thr 

Jl 1 C CTC ACC 

HCV1 



Arg Asp Pro Thr Val Pro Leu Ala Arg 

Jl 8 CGT GAC CCC ACC GTC CCC CTT GCG CGG 
HCV1 T A AC C A A 

Thr 



Ala Ala Trp Glu Thr Ala Arg His Thr 

Jl 35 GCT GCG TGG GAG ACA GCT AGA CAC ACT 
HCV1 A 



Pro Val Asn Ser Trp Leu Gly Asn Ile 

Jl 62 CCA GTC AAC TCC TGG CTA GGC AAC ATC 
HCV1 T A 



Ile Met Tyr Ala Pro Thr Leu Trp Ala 

Jl 89 ATC ATG TAT GCG CCC ACT TTG TGG GCA 
HCV1 T C AC G 

Phe 



Arg Met Ile Leu Met Thr His Phe Phe 

Jl 116 AGG ATG ATT CTG ATG ACT CAC TTC TTC 
HCV1 A C T T 



Ser Ile Leu Leu Ala Gln Glu Gln Leu 

Jl 143 TCC ATC CTT CTA GCC CAG GAG CAA CTT 
HCV1 AG G A AG C G 

Val Ile Arg Asp 



Glu Lys Ala Leu Asp Cys Gln Ile Tyr 

Jl 170 GAA AAA GCC CTG GAT TGT CAA ATC TAC 
HCV1 C G C CGG 

Gln Glu 
*** *** 
Gly Ala Cys Tyr Ser Ile Glu Pro Leu 
Jl 197 GGG GCC TGT TAC TCC ATT GAG CCA CTT 
HCV1 C A A 



Asp Leu Pro Gln Ile Ile Glu Arg Leu 
Jl 224 GAC CTA CCT CAG ATC ATT GAA CGA CTC 
HCV1 T CA C A 

Pro Gln 
*** *** 



His Gly Leu Ser Ala Phe Ser Leu His 

Jl 251 CAT GGT CTT AGC GCA TTT TCA CTC CAT 
HCV1 C C C 



Ser Tyr Ser Pro Gly Glu Ile Asn Arg 
J1 278 AGT TAC TCT CCA GGT GAG ATC AAT AGG 
HCV1 A T 



Val Ala Ser Cys Leu Arg Lys Leu Gly 

Jl 305 GTG GCT TCA TGC CTC AGG AAG CTT GGG 
HCV1 C G A A 

Ala 



Val Pro Pro Leu Arg Val Trp Arg His 

Jl 332 GTA CCA CCC TTG CGA GTC TGG AGA CAT 
HCV1 G CT C 

Ala 

*** 



Arg Ala Arg Ser Val Arg Ala Lys Leu 
Jl 359 CGG GCC AGA AGT GTC CGC GCT AAG CTA 
HCV1 CGC G T 

Arg 



Leu Ser Gln Gly Gly Arg Ala Ala Thr 

Jl 386 CTG TCC CAA GGG GGG AGG GCC GCC ACT 
HCV1 G AG A C T TA 

Ala Arg Ile 
*** *** 
-267 GCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGG 
CGCAGATCGGTACCGCAATCATACTCACAGCACGTCGGAGGTCC 



-223 ACCCCCCCTCCCGGGAGAGCCATAGTGGTCTGCGGAACCGGTGA 
TGGGGGGGAGGGCCCTCTCGGTATCACCAGACGCCTTGGCCACT 



-179 GTAGACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAA 
CATGTGGCCTTAACGGTCCTGCTGGCCCAGGAAAGAACCTAGTT 



-135 CCCGCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCAAGACTGC 
GGGCGAGTTACGGACCTCTAAACCCGCACGGGGGCGTTCTGACG 



-91 TAGCCGAGTAGTGTTGGGTCGCGAAAGGCCTTGTGGTACTGCCT 
ATCGGCTCATCACAACCCAGCGCTTTCCGGAACACCATGACGGA 



-47 GATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGC 
CTATCCCACGAACGCTCACGGGGCCCTCCAGAGCATCTGGCACG 



-3 ACC -1 
TGG 



Met Ser Thr Asn Pro Lys Pro Gln Lys Lys Asn 

1 ATG AGC ACG AAT CCT AAA CCT CAA AAA AAA AAC 
TAC TCG TGC TTA GGA TTT GGA GTT TTT TTT TTG 



Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val 

34 AAA CGT AAC ACC AAC CGT CGC CCA CAG GAC GTC 
TTT GCA TTG TGG TTG GCA GCG GGT GTC CTG CAG 



Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly 

67 AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT GGA 
TTC AAG GGC CCA CCG CCA GTC TAG CAA CCA CCT 
Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu 

100 GTT TAC TTG TTG CCG CGC AGG GGC CCT AGA TTG 
CAA ATG AAC AAC GGC GCG TCC CCG GGA TCT AAC 



Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg 

133 GGT GTG CGC GCG ACG AGA AAG ACT TCC GAG CGG 
CCA CAC GCG CGC TGC TCT TTC TGA AGG CTC GCC 



Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro 

166 TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC CCC 
AGC GTT GGA GCT CCA TCT GCA GTC GGA TAG GGG 



Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala 

199 AAG GCT CGT CGG CCC GAG GGC AGG ACC TGG GCT 
TTC CGA GCA GCC GGG CTC CCG TCC TGG ACC CGA 



Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn 

232 CAG CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT 
GTC GGG CCC ATG GGA ACC GGG GAG ATA CCG TTA 



Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser 

265 GAG GGC TGC GGG TGG GCG GGA TGG CTC CTG TCT 
CTC CCG ACG CCC ACC CGC CCT ACC GAG GAC AGA 



Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr 
298 CCC CGT GGC TCT CGG CCT AGC TGG GGC CCC ACA 
GGG GCA CCG AGA GCC GGA TCG ACC CCG GGG TGT 



Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys 

331 GAC CCC CGG CGT AGG TCG CGC AAT TTG GGT AAG 
CTG GGG GCC GCA TCC AGC GCG TTA AAC CCA TTC 



Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp 

364 GTC ATC GAT ACC CTT ACG TGC GGC TTC GCC GAC 
CAG TAG CTA TGG GAA TGC ACG CCG AAG CGG CTG 
Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro 

397 CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT 
GAG TAC CCC ATG TAT GGC GAG CAG CCG CGG GGA 



Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly 

430 CTT GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC 
GAA CCT CCG CGA CGG TCC CGG GAC CGC GTA CCG 



Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 

463 GTC CGG GTT CTG GAA GAC GGC GTG AAC TAT GCA 
CAG GCC GAA GAC CTT CTG CCG CAC TTG ATA CGT 



Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile 

496 ACA GGG AAC CTT CCT GGT TGC TCT TTC TCT ATC 
TGT CCC TTG GAA GGA CCA ACG AGA AAG AGA TAG 



Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val 

529 TTC CTT CTG GCC CTG CTC TCT TGC TTG ACT GTG 
AAG GAA GAC CGG GAC GAG AGA ACG AAC TGA CAC 



Pro Ala Ser Ala Tyr Gln Val Arg Asn Ser Thr 

562 CCC GCT TCG GCC TAC CAA GTG CGC AAC TCC ACG 
GGG CGA AGC CGG ATG GTT CAC GCG TTG AGG TGC 



Gly Leu Tyr His Val Thr Asn Asp Cys Pro Asn 
595 GGG CTT TAC CAC GTC ACC AAT GAT TGC CCT AAC 
CCC GAA ATG GTG CAG TGG TTA CTA ACG GGA TTG 



Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile 

628 TCG AGT ATT GTG TAC GAG GCG GCC GAT GCC ATC 
AGC TCA TAA CAC ATG CTC CGC CGG CTA CGG TAG 



Leu His Thr Pro Gly Cys Val Pro Cys Val Arg 

661 CTG CAC ACT CCG GGG TGC GTC CCT TGC GTT CGT 
GAC GTG TGA GGC CCC ACG CAG GGA ACG CAA GCA 
Glu Gly Asn Ala Ser Arg Cys Trp Val Ala Met 

694 GAG GGC AAC GCC TCG AGG TGT TGG GTG GCG ATG 
CTC CCG TTG CGG AGC TCC ACA ACC CAC CGC TAC 



Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu 

727 ACC CCT ACG GTG GCC ACC AGG GAT GGC AAA CTC 
TGG GGA TGC CAC CGG TGG TCC CTA CCG TTT GAG 



Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu 

760 CCC GCG ACG CAG CTT CGA CGT CAC ATC GAT CTG 
GGG CGC TGC GTC GAA GCT GCA GTG TAG CTA GAC 



Leu Val Gly Ser Ala Thr Leu Cys Ser Ala Leu 

793 CTT GTC GGG AGC GCC ACC CTC TGT TCG GCC CTC 
GAA CAG CCC TCG CGG TGG GAG ACA AGC CGG GAG 



Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu 

826 TAC GTG GGG GAC CTA TGC GGG TCT GTC TTT CTT 
ATG CAC CCC CTG GAT ACG CCC AGA CAG AAA GAA 



Val Gly Gln Leu Phe Thr Phe Ser Pro Arg Arg 
859 GTC GGC CAA CTG TTC ACC TTC TCT CCC AGG CGC 
CAG CCG GTT GAC AAG TGG AAG AGA GGG TCC GCG 



His Trp Thr Thr Gln Gly Cys Asn Cys Ser Ile 

892 CAC TGG ACG ACG CAA GGT TGC AAT TGC TCT ATC 
GTG ACC TGC TGC GTT CCA ACG TTA ACG AGA TAG 



Tyr Pro Gly His Ile Thr Gly His Arg Met Ala 

925 TAT CCC GGC CAT ATA ACG GGT CAC CGC ATG GCA 
ATA GGG CCG GTA TAT TGC CCA GTG GCG TAC CGT 



Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr 

958 TGG GAT ATG ATG ATG AAC TGG TCC CCT ACG ACG 
ACC CTA TAC TAC TAC TTG ACC AGG GGA TGC TGC 
Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro 

991 GCG TTG GTA ATG GCT CAG CTG CTC CGG ATC CCA 
CGC AAC CAT TAC CGA GTC GAC GAG GCC TAG GGT 



Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His 

1024 CAA GCC ATC TTG GAC ATG ATC GCT GGT GCT CAC 
GTT CGG TAG AAC CTG TAC TAG CGA CCA CGA GTG 



Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser 

1057 TGG GGA GTC CTG GCG GGC ATA GCG TAT TTC TCC 
ACC CCT CAG GAC CGC CCG TAT CGC ATA AAG AGG 



Met Val Gly Asn Trp Ala Lys Val Leu Val Val 

1090 ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA GTG 
TAC CAC CCC TTG ACC CGC TTC CAG GAC CAT CAC 



Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr 

1123 CTG CTG CTA TTT GCC GGC GTC GAC GCG GAA ACC 
GAC GAC GAT AAA CGG CCG CAG CTG CGC CTT TGG 



His Val Thr Gly Gly Ser Ala Gly His Thr Val 

1156 CAC GTC ACC GGG GGA AGT GCC GGC CAC ACT GTG 
GTG CAG TGG CCC CCT TCA CGG CCG GTG TGA CAC 



Ser Gly Phe Val Ser Leu Leu Ala Pro Gly Ala 
1189 TCT GGA TTT GTT AGC CTC CTC GCA CCA GGC GCC 
AGA CCT AAA CAA TCG GAG GAG CGT GGT CCG CGG 



Lys Gln Asn Val Gln Leu Ile Asn Thr Asn Gly 

1222 AAG CAG AAC GTC CAG CTG ATC AAC ACC AAC GGC 
TTC GTC TTG CAG GTC GAC TAG TTG TGG TTG CCG 



Ser Trp His Leu Asn Ser Thr Ala Leu Asn Cys 

1255 AGT TGG CAC CTC AAT AGC ACG GCC CTG AAC TGC 
TCA ACC GTG GAG TTA TCG TGC CGG GAC TTG ACG 
Asn Asp Ser Leu Asn Thr Gly Trp Leu Ala Gly
1288 AAT GAT AGC CTC AAC ACC GGC TGG TTG GCA GGG 
TTA CTA TCG GAG TTG TGG CCG ACC AAC CGT CCC 



Leu Phe Tyr His His Lys Phe Asn Ser Ser Gly
1321 CTT TTC TAT CAC CAC AAG TTC AAC TCT TCA GGC
GAA AAG ATA GTG GTG TTC AAG TTG AGA AGT CCG 



Cys Pro Glu Arg Leu Ala Ser Cys Arg Pro Leu

1354 TGT CCT GAG AGG CTA GCC AGC TGC CGA CCC CTT
ACA GGA CTC TCC GAT CGG TCG ACG GCT GGG GAA 



Thr Asp Phe Asp Gln Gly Trp Gly Pro Ile Ser

1387 ACC GAT TTT GAC CAG GGC TGG GGC CCT ATC AGT 
TGG CTA AAA CTG GTC CCG ACC CCG GGA TAG TCA 



Tyr Ala Asn Gly Ser Gly Pro Asp Gln Arg Pro

1420 TAT GCC AAC GGA AGC GGC CCC GAC CAG CGC CCC 
ATA CGG TTG CCT TCG CCG GGG CTG GTC GCG GGG 



Tyr Cys Trp His Tyr Pro Pro Lys Pro Cys Gly 

1453 TAC TGC TGG CAC TAC CCC CCA AAA CCT TGC GGT 
ATC ACG ACC GTG ATC GGG GGT TTT GGA ACG CCA 



Ile Val Pro Ala Lys Ser Val Cys Gly Pro Val
1486 ATT GTG CCC GCG AAG AGT GTG TGT GGT CCG GTA 
TAA CAC GGG CGC TTC TCA CAC ACA CCA GGC CAT 



Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly

1519 TAT TGC TTC ACT CCC AGC CCC GTG GTG GTG GGA 
ATA ACG AAG TGA GGG TCG GGG CAC CAC CAC CCT 



Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser

1552 ACG ACC GAC AGG TCG GGC GCG CCC ACC TAC AGC 
TGC TGG CTG TCC AGC CCG CGC GGG TGG ATC TCG 
Trp Gly Glu Asn Asp Thr Asp Val Phe Val Leu

1585 TGG GGT GAA AAT GAT ACG GAC GTC TTC GTC CTT
ACC CCA CTT TTA CTA TGC CTG CAG AAG CAG GAA 



Asn Asn Thr Arg Pro Pro Leu Gly Asn Trp Phe 

1618 AAC AAT ACC AGG CCA CCG CTG GGC AAT TGG TTC 
TTG TTA TGG TCC GGT GGC GAC CCG TTA ACC AAG 



Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr 

1651 GGT TGT ACC TGG ATG AAC TCA ACT GGA TTC ACC 
CCA ACA TGG ACC TAC TTG ACT TGA CCT AAG TGG 



Lys Val Cys Gly Ala Pro Pro Cys Val Ile Gly

1684 AAA GTG TGC GGA GCG CCT CCT TGT GTC ATC GGA 
TTT CAC ACG CCT CGC GGA GGA ACA CAG TAG CCT 



Gly Ala Gly Asn Asn Thr Leu His Cys Pro Thr 

1717 GGG GCG GGC AAC AAC ACC CTG CAC TGC CCC ACT 
CCC CGC CCG TTG TTG TGG GAC GTG ACG GGG TGA 



Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr 

1750 GAT TGC TTC CGC AAG CAT CCG GAC GCC ACA TAC 
CTA ACG AAG GCG TTC GTA GGC CTG CGG TGT ATG 



Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro

1783 TCT CGG TGC GGC TCC GGT CCC TGG ATC ACA CCC 
AGA GCC ACG CCG AGG CCA GGG ACC TAG TGT GGG 



Arg Cys Leu Val Asp Tyr Pro Tyr Arg Leu Trp 

1816 AGG TGC CTG GTC GAC TAC CCG TAT AGG CTT TGG 
TCC ACG GAC CAG CTG ATG GGC ATA TCC GAA ACC 



His Tyr Pro Cys Thr Ile Asn Tyr Thr Ile Phe
1849 CAT TAT CCT TGT ACC ATC AAC TAC ACC ATA TTT 
GTA ATA GGA ACA TGG TAG TTG ATG TGG TAT AAA 
Lys Ile Arg Met Tyr Val Gly Gly Val Glu His

1882 AAA ATC AGG ATG TAC GTG GGA GGG GTC GAA CAC 
TTT TAG TCC TAC ATG CAC CCT CCC CAG CTT GTG 



Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly 

1915 AGG CTG GAA GCT GCC TGC AAC TGG ACG CGG GGC 
TCC GAC CTT CGA CGG ACG TTG ACC TGC GCC CCG 



Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg Ser 

1948 GAA CGT TGC GAT CTG GAA GAC AGG GAC AGG TCC 
CTT GCA ACG CTA GAC CTT CTG TCC CTG TCC AGG 



Glu Leu Ser Pro Leu Leu Leu Thr Thr Thr Gln
1981 GAG CTC AGC CCG TTA CTG CTG ACC ACT ACA CAG 
CTC GAG TCG GGC AAT GAC GAC TGG TGA TGT GTC 



Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu

2014 TGG CAG GTC CTC CCG TGT TCC TTC ACA ACC CTA 
ACC GTC CAG GAG GGC ACA AGG AAG TGT TGG GAT 



Pro Ala Leu Ser Thr Gly Leu Ile His Leu His

2047 CCA GCC TTG TCC ACC GGC CTC ATC CAC CTC CAC 
GGT CGG AAC AGG TGG CCG GAG TAG GTG GAG GTG 



Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly

2080 CAG AAC ATT GTG GAC GTG CAG TAC TTG TAC GGG 
GTC TTG TAA CAC CTG CAC GTC ATG AAC ATG CCC 



Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys
2113 GTG GGG TCA AGC ATC GCG TCC TGG GCC ATT AAG 
CAC CCC AGT TCG TAG CGC AGG ACC CGG TAA TTC 



Trp Glu Tyr Val Val Leu Leu Phe Leu Leu Leu 

2146 TGG GAG TAC GTC GTT CTC CTG TTC CTT CTG CTT 
ACC CTC ATG CAG CAA GAG GAC AAG GAA GAC GAA 
Ala Asp Ala Arg Val Cys Ser Cys Leu Trp Met

2179 GCA GAC GCG CGC GTC TGC TCC TGC TTG TGG ATG
CGT CTG CGC GCG CAG ACG AGG ACG AAC ACC TAC 



Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu

2212 ATG CTA CTC ATA TCC CAA GCG GAG GCG GCT TTG 
TAC GAT GAG TAT AGG GTT CGC CTC CGC CGA AAC 



Glu Asn Leu Val Ile Leu Asn Ala Ala Ser Leu

2245 GAG AAC CTC GTA ATA CTT AAT GCA GCA TCC CTG 
CTC TTG GAG CAT TAT GAA TTA CGT CGT AGG GAC 



Ala Gly Thr His Gly Leu Val Ser Phe Leu Val 

2278 GCC GGG ACG CAC GGT CTT GTA TCC TTC CTC GTG 
CGG CCC TGC GTG CCA GAA CAT AGG AAG GAG CAC 



Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Lys

2311 TTC TTC TGC TTT GCA TGG TAT TTG AAG GGT AAG 
AAG AAG ACG AAA CGT ACC ATA AAC TTC CCA TTC 



Trp Val Pro Gly Ala Val Tyr Thr Phe Tyr Gly 
2344 TGG GTG CCC GGA GCG GTC TAC ACC TTC TAC GGG 
ACC CAC GGG CCT CGC CAG ATG TGG AAG ATG CCC 



Met Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu 

2377 ATG TGG CCT CTC CTC CTG CTC CTG TTG GCG TTG 
TAC ACC GGA GAG GAG GAC GAG GAC AAC CGC AAC 



Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Val
2410 CCC CAG CGG GCG TAC GCG CTG GAC ACG GAG GTG 
GGG GTC GCC CGC ATG CGC GAC CTG TGC CTC CAC 



Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly 

2443 GCC GCG TCG TGT GGC GGT GTT GTT CTC GTC GGG
CGG CGC AGC ACA CCG CCA CAA CAA GAG CAG CCC 
Leu Met Ala Leu Thr Leu Ser Pro Tyr Tyr Lys
2476 TTG ATG GCG CTG ACT CTG TCA CCA TAT TAC AAG
AAC TAC CGC GAC TGA GAC AGT GGT ATA ATG TTC



Arg Tyr Ile Ser Trp Cys Leu Trp Trp Leu Gln

2509 CGC TAT ATC AGC TGG TGC TTG TGG TGG CTT CAG 
GCG ATA TAG TCG ACC ACG AAC ACC ACC GAA GTC 



Tyr Phe Leu Thr Arg Val Glu Ala Gln Leu His

2542 TAT TTT CTG ACC AGA GTG GAA GCG CAA CTG CAC 
ATA AAA GAC TGG TCT CAC CTT CGC GTT GAC GTG 



Val Trp Ile Pro Pro Leu Asn Val Arg Gly Gly

2575 GTG TGG ATT CCC CCC CTC AAC GTC CGA GGG GGG 
CAC ACC TAA GGG GGG GAG TTG CAG GCT CCC CCC 



Arg Asp Ala Val Ile Leu Leu Met Cys Ala Val

2608 CGC GAC GCC GTC ATC TTA CTC ATG TGT GCT GTA 
GCG CTG CGG CAG TAG AAT GAG TAC ACA CGA CAT 



His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu

2641 CAC CCG ACT CTG GTA TTT GAC ATC ACC AAA TTG 
GTG GGC TGA GAC CAT AAA CTG TAG TGG TTT AAC 



Leu Leu Ala Val Phe Gly Pro Leu Trp Ile Leu
2674 CTG CTG GCC GTC TTC GGA CCC CTT TGG ATT CTT 
GAC GAC CGG CAG AAG CCT GGG GAA ACC TAA GAA 



Gln Ala Ser Leu Leu Lys Val Pro Tyr Phe Val

2707 CAA GCC AGT TTG CTT AAA GTA CCC TAC TTT GTG 
GTT CGG TCA AAC GAA TTT CAT GGG ATG AAA CAC 



Arg Val Gln Gly Leu Leu Arg Phe Cys Ala Leu

2740 CGC GTC CAA GGC CTT CTC CGG TTC TGC GCG TTA 
GCG CAG GTT CCG GAA GAG GCC AAG ACG CGC AAT 
Ala Arg Lys Met Ile Gly Gly His Tyr Val Gln

2773 GCG CGG AAG ATG ATC GGA GGC CAT TAC GTG CAA
CGC GCC TTC TAC TAG CCT CCG GTA ATG CAC GTT 



Met Val Ile Ile Lys Leu Gly Ala Leu Thr Gly

2806 ATG GTC ATC ATT AAG TTA GGG GCG CTT ACT GGC 
TAC CAG TAG TAA TTC AAT CCC CGC GAA TGA CCG 



Thr Tyr Val Tyr Asn His Leu Thr Pro Leu Arg 

2839 ACC TAT GTT TAT AAC CAT CTC ACT CCT CTT CGG 
TGG ATA CAA ATA TTG GTA GAG TGA GGA GAA GCC 



Asp Trp Ala His Asn Gly Leu Arg Asp Leu Ala 
2872 GAC TGG GCG CAC AAC GGC TTG CGA GAT CTG GCC 
CTG ACC CGC GTG TTG CCG AAC GCT CTA GAC CGG 



Val Ala Val Glu Pro Val Val Phe Ser Gln Met
2905 GTG GCT GTA GAG CCA GTC GTC TTC TCC CAA ATG 
CAC CGA CAT CTC GGT CAG CAG AAG AGG GTT TAC 



Glu Thr Lys Leu Ile Thr Trp Gly Ala Asp Thr

2938 GAG ACC AAG CTC ATC ACG TGG GGG GCA GAT ACC 
CTC TGG TTC GAG TAG TGC ACC CCC CGT CTA TGG 



Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro

2971 GCC GCG TGC GGT GAC ATC ATC AAC GGC TTG CCT 
CGG CGC ACG CCA CTG TAG TAG TTG CCG AAC GGA 



Val Ser Ala Arg Arg Gly Arg Glu Ile Leu Leu

3004 GTT TCC GCC CGC AGG GGC CGG GAG ATA CTG CTC 
CAA AGG CGG GCG TCC CCG GCC CTC TAT GAC GAG 



Gly Pro Ala Asp Gly Met Val Ser Lys Gly Trp 
3037 GGG CCA GCC GAT GGA ATG GTC TCC AAG GGG TGG 
CCC GGT CGG CTA CCT TAC CAG AGG TTC CCC ACC 
Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln

3070 AGG TTG CTG GCG CCC ATC ACG GCG TAC GCC CAG 
TCC AAC GAC CGC GGG TAG TGC CGC ATG CGG GTC 



Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr

3103 CAG ACA AGG GGC CTC CTA GGG TGC ATA ATC ACC 
GTC TGT TCC CCG GAG GAT CCC ACG TAT TAG TGG 



Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu

3136 AGC CTA ACT GGC CGG GAC AAA AAC CAA GTG GAG 
TCG GAT TGA CCG GCC CTG TTT TTG GTT CAC CTC 



Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln

3169 GGT GAG GTC CAG ATT GTG TCA ACT GCT GCC CAA 
CCA CTC CAG GTC TAA CAC AGT TGA CGA CGG GTT 



Thr Phe Leu Ala Thr Cys Ile Asn Gly Val Cys

3202 ACC TTC CTG GCA ACG TGC ATC AAT GGG GTG TGC 
TGG AAG GAC CGT TGC ACG TAG TTA CCC CAC ACG 



Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr 

3235 TGG ACT GTC TAC CAC GGG GCC GGA ACG AGG ACC 
ACC TGA CAG ATG GTG CCC CGG CCT TGC TCC TGG 



Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met

3268 ATC GCG TCA CCC AAG GGT CCT GTC ATC CAG ATG 
TAG CGC AGT GGG TTC CCA GGA CAG TAG GTC TAC 



Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp

3301 TAT ACC AAT GTA GAC CAA GAC CTT GTG GGC TGG 
ATA TGG TTA CAT CTG GTT CTG GAA CAC CCG ACC 



Pro Ala Pro Gln Gly Ser Arg Ser Leu Thr Pro

3334 CCC GCT CCG CAA GGT AGC CGC TCA TTG ACA CCC
GGG CGA GGC GTT CCA TCG GCG AGT AAC TGT GGG 
Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val

3367 TGC ACT TGC GGC TCC TCG GAC CTT TAC CTG GTC
ACG TGA ACG CCG AGG AGC CTG GAA ATG GAC CAG



Thr Arg His Ala Asp Val Ile Pro Val Arg Arg

3400 ACG AGG CAC GCC GAT GTC ATT CCC GTG CGC CGG 
TGC TCC GTG CGG CTA CAG TAA GGG CAC GCG GCC 



Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro 

3433 CGG GGT GAT AGC AGG GGC AGC CTG CTG TCG CCC 
GCC CCA CTA TCG TCC CCG TCG GAC GAC AGC GGG 



Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly

3466 CGG CCC ATT TCC TAC TTG AAA GGC TCC TCG GGG 
GCC GGG TAA AGG ATG AAC TTT CCG AGG AGC CCC 



Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val 

3499 GGT CCG CTG TTG TGC CCC GCG GGG CAC GCC GTG 
CCA GGC GAC AAC ACG GGG CGC CCC GTG CGG CAC 



Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly

3532 GGC ATA TTT AGG GCC GCG GTG TGC ACC CGT GGA 
CCG TAT AAA TCC CGG CGC CAC ACG TGG GCA CCT 



Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu

3565 GTG GCT AAG GCG GTG GAC TTT ATC CCT GTG GAG 
CAC CGA TTC CGC CAC CTG AAA TAG GGA CAC CTC 



Asn Leu Glu Thr Thr Met Arg Ser Pro Val Phe 

3598 AAC CTA GAG ACA ACC ATG AGG TCC CCG GTG TTC 
TTG GAT CTC TGT TGG TAC TCC AGG GGC CAC AAG 



Thr Asp Asn Ser Ser Pro Pro Val Val Pro Gln

3631 ACG GAT AAC TCC TCT CCA CCA GTA GTG CCC CAG 
TGC CTA TTG AGG AGA GGT GGT CAT CAC GGG GTC 
Ser Phe Gln Val Ala His Leu His Ala Pro Thr

3664 AGC TTC CAG GTG GCT CAC CTC CAT GCT CCC ACA
TCG AAG GTC CAC CGA GTG GAG GTA CGA GGG TGT



Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala

3697 GGC AGC GGC AAA AGC ACC AAG GTC CCG GCT GCA
CCG TCG CCG TTT TCG TGG TTC CAG GGC CGA CGT



Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu

3730 TAT GCA GCT CAG GGC TAT AAG GTG CTA GTA CTC
ATA CGT CGA GTC CCG ATA TTC CAC GAT CAT GAG



Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly

3763 AAC CCC TCT GTT GCT GCA ACA CTG GGC TTT GGT
TTG GGG AGA CAA CGA CGT TGT GAC CCG AAA CCA



Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro

3796 GCT TAC ATG TCC AAG GCT CAT GGG ATC GAT CCT
CGA ATG TAC AGG TTC CGA GTA CCC TAG CTA GGA



Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr

3829 AAC ATC AGG ACC GGG GTG AGA ACA ATT ACC ACT
TTG TAG TCC TGG CCC CAC TCT TGT TAA TGG TGA



Gly Ser

3862 GGC AGC
CCG TCG



Ile Thr Tyr Ser Thr Tyr Gly Lys

ATC ACG TAC TCC ACC TAC GGC AAG
TAG TGC ATG AGG TGG ATG CCG TTC



Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala

3895 TTC CTT GCC GAC GGC GGG TGC TCG GGG GGC GCT
AAG GAA CGG CTG CCG CCC ACG AGC CCC CCG CGA



Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser

3928 TAT GAC ATA ATA ATT TGT GAC GAG TGC CAC TCC
ATA CTG TAT TAT TAA ACA CTG CTC ACG GTG AGG
Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr

3961 ACG GAT GCC ACA TCC ATC TTG GGC ATC GGC ACT
TGC CTA CGG TGT AGG TAG AAC CCG TAG CCG TGA



Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg

3994 GTC CTT GAC CAA GCA GAG ACT GCG GGG GCG AGA 
GAG GAA CTG GTT CGT CTC TGA CGC CCC CGC TCT 



Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 

4027 CTG GTT GTG CTC GCC ACC GCC ACC CCT CCG GGC 
GAC CAA CAC GAG CGG TGG CGG TGG GGA GGC CCG 



Ser Val Thr Val Pro His Pro Asn Ile Glu Glu

4060 TCC GTC ACT GTG CCC CAT CCC AAC ATC GAG GAG 
AGG CAG TGA CAC GGG GTA GGG TTG TAG CTC CTC 



Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe

4093 GTT GCT CTG TCC ACC ACC GGA GAG ATC CCT TTT 
CAA CGA GAC AGG TGG TGG CCT CTC TAG GGA AAA 



Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys

4126 TAC GGC AAG GCT ATC CCC CTC GAA GTA ATC AAG 
ATG CCG TTC CGA TAG GGG GAG CTT CAT TAG TTC 



Gly Gly Arg His Leu Ile Phe Cys His Ser Lys

4159 GGG GGG AGA CAT CTC ATC TTC TGT CAT TCA AAG 
CCC CCC TCT GTA GAG TAG AAG ACA GTA AGT TTC 



Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val

4192 AAG AAG TGC GAC GAA CTC GCC GCA AAG CTG GTC 
TTC TTC ACG CTG CTT GAG CGG CGT TTC GAC CAG 



Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg

4225 GCA TTG GGC ATC AAT GCC GTG GCC TAC TAC CGC 
CGT AAC CCG TAG TTA CGG CAC CGG ATG ATG GCG 
Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly
4258 GGT CTT GAC GTG TCC GTC ATC CCG ACC AGC GGC
CCA GAA CTG CAC AGG CAG TAG GGC TGG TCG CCG 



Asp Val Val Val Val Ala Thr Asp Ala Leu Met

4291 GAT GTT GTC GTC GTG GCA ACC GAT GCC CTC ATG 
CTA CAA CAG CAG CAC CGT TGG CTA CGG GAG TAC 



Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile

4324 ACC GGC TAT ACC GGC GAC TTC GAC TCG GTG ATA 
TGG CCG ATA TGG CCG CTG AAG CTG AGC CAC TAT 



Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp

4357 GAC TGC AAT ACG TGT GTC ACC CAG ACA GTC GAT 
CTG ACG TTA TGC ACA CAG TGG GTC TGT CAG CTA 



Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr

4390 TTC AGC CTT GAC CCT ACC TTC ACC ATT GAG ACA 
AAG TCG GAA CTG GGA TGG AAG TGG TAA CTC TGT 



Ile Thr Leu Pro Gln Asp Ala Val Ser Arg Thr

4423 ATC ACG CTC CCC CAG GAT GCT GTC TCC CGC ACT 
TAG TGC GAG GGG GTC CTA CGA CAG AGG GCG TGA 



Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro

4456 CAA CGT CGG GGC AGG ACT GGC AGG GGG AAG CCA 
GTT GCA GCC CCG TCC TGA CCG TCC CCC TTC GGT 



Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
4489 GGC ATC TAC AGA TTT GTG GCA CCG GGG GAG CGC 
CCG TAG ATG TCT AAA CAC CGT GGC CCC CTC GCG 



Pro Ser Gly 

4522 CCC TCC GGC 
GGG AGG CCG 



Met Phe Asp Ser 

ATG TTC GAC TCG 
TAC AAG CTG AGC 
Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu

4555 GAG TGC TAT GAC GCA GGC TGT GCT TGG TAT GAG
CTC ACG ATA CTG CGT CCG ACA CGA ACC ATA CTC



Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg 
4588 CTC ACG CCC GCC GAG ACT ACA GTT AGG CTA CGA 
GAG TGC GGG CGG CTC TGA TGT CAA TCC GAT GCT 



Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys 

4621 GCG TAC ATG AAC ACC CCG GGG CTT CCC GTG TGC
CGC ATG TAC TTG TGG GGC CCC GAA GGG CAC ACG 



Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe

4654 CAG GAC CAT CTT GAA TTT TGG GAG GGC GTC TTT 
GTC CTG GTA GAA CTT AAA ACC CTC CCG CAG AAA 



Thr Gly Leu Thr His Ile Asp Ala His Phe Leu
4687 ACA GGC CTC ACT CAT ATA GAT GCC CAC TTT CTA 
TGT CCG GAG TGA GTA TAT CTA CGG GTG AAA GAT 



Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro

4720 TCC CAG ACA AAG CAG AGT GGG GAG AAC CTT CCT 
AGG GTC TGT TTC GTC TCA CCC CTC TTG GAA GGA 



Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala

4753 TAC CTG GTA GCG TAC CAA GCC ACC GTG TGC GCT 
ATG GAC CAT CGC ATG GTT CGG TGG CAC ACG CGA 



Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln

4786 AGG GCT CAA GCC CCT CCC CCA TCG TGG GAC CAG 
TCC CGA GTT CGG GGA GGG GGT AGC ACC CTG GTC 


Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr

4819 ATG TGG AAG TGT TTG ATT CGC CTC AAG CCC ACC 
TAC ACC TTC ACA AAC TAA GCG GAG TTC GGG TGG 
Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu

4852 CTC CAT GGG CCA ACA CCC CTG CTA TAC AGA CTG 
GAG GTA CCC GGT TGT GGG GAC GAT ATG TCT GAC 



Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His

4885 GGC GCT GTT CAG AAT GAA ATC ACC CTG ACG CAC
CCC CGA CAA GTC TTA CTT TAG TGG GAC TGC GTG



Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser

4918 CCA GTC ACC AAA TAC ATC ATG ACA TGC ATG TCG 
GGT CAG TGG TTT ATG TAG TAC TGT ACG TAC AGC 



Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val 
4951 GCC GAC CTG GAG GTC GTC ACG AGC ACC TGG GTG 
CGG CTG GAC CTC CAG CAG TGC TCG TGG ACC CAC 



Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala 

4984 CTC GTT GGC GGC GTC CTG GCT GCT TTG GCC GCG
GAG CAA CCG CCG CAG GAC CGA CGA AAC CGG CGC 



Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val

5017 TAT TGC CTG TCA ACA GGC TGC GTG GTC ATA GTG 
ATA ACG GAC AGT TGT CCG ACG CAC CAG TAT CAC 



Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile

5050 GGC AGG GTC GTC TTG TCC GGG AAG CCG GCA ATC 
CCG TCC CAG CAG AAC AGG CCC TTC GGC CGT TAG 



Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe

5083 ATA CCT GAC AGG GAA GTC CTC TAC CGA GAG TTC 
TAT GGA CTG TCC CTT CAG GAG ATG GCT CTC AAG 



Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro

5116 GAT GAG ATG GAA GAG TGC TCT CAG CAC TTA CCG 
CTA CTC TAC CTT CTC ACG AGA GTC GTG AAT GGC 
Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln

5149 TAC ATC GAG CAA GGG ATG ATG CTC GCC GAG CAG 
ATG TAG CTC GTT CCC TAC TAC GAG CGG CTC GTC 



Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr

5182 TTC AAG CAG AAG GCC CTC GGC CTC CTG CAG ACC 
AAG TTC GTC TTC CGG GAG CCG GAG GAC GTC TGG 



Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala

5215 GCG TCC CGT CAG GCA GAG GTT ATC GCC CCT GCT 
CGC AGG GCA GTC CGT CTC CAA TAG CGG GGA CGA 



Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe

5248 GTC CAG ACC AAC TGG CAA AAA CTC GAG ACC TTC 
CAG GTC TGG TTG ACC GTT TTT GAG CTC TGG AAG 



Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly
5281 TGG GCG AAG CAT ATG TGG AAC TTC ATC AGT GGG 
ACC CGC TTC GTA TAC ACC TTG AAG TAG TCA CCC 



Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro
5314 ATA CAA TAC TTG GCG GGC TTG TCA ACG CTG CCT 
TAT GTT ATG AAC CGC CCG AAC AGT TGC GAC GGA 



Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe

5347 GGT AAC CCC GCC ATT GCT TCA TTG ATG GCT TTT 
CCA TTG GGG CGG TAA CGA AGT AAC TAC CGA AAA 



Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser 

5380 ACA GCT GCT GTC ACC AGC CCA CTA ACC ACT AGC 
TGT CGA CGA CAG TGG TCG GGT GAT TGG TGA TCG 



Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp

5413 CAA ACC CTC CTC TTC AAC ATA TTG GGG GGG TGG 
GTT TGG GAG GAG AAG TTG TAT AAC CCC CCC ACC 
Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala

5446 GTG GCT GCC CAG CTC GCC GCC CCC GGT GCC GCT
CAC CGA CGG GTC GAG CGG CGG GGG CCA CGG CGA



Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala 

5479 ACT GCC TTT GTG GGC GCT GGC TTA GCT GGC GCC 
TGA CGG AAA CAC CCG CGA CCG AAT CGA CCG CGG 



Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu

5512 GCC ATC GGC AGT GTT GGA CTG GGG AAG GTC CTC 
CGG TAG CCG TCA CAA CCT GAC CCC TTC CAG GAG 



Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val

5545 ATA GAC ATC CTT GCA GGG TAT GGC GCG GGC GTG 
TAT CTG TAG GAA CGT CCC ATA CCG CGC CCG CAC 



Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser
5578 GCG GGA GCT CTT GTG GCA TTC AAG ATC ATG AGC 
CGC CCT CGA GAA CAC CGT AAG TTC TAG TAC TCG 



Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn 

5611 GGT GAG GTC CCC TCC ACG GAG GAC CTG GTC AAT 
CCA CTC CAG GGG AGG TGC CTC CTG GAC CAG TTA 



Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu
5644 CTA CTG CCC GCC ATC CTC TCG CCC GGA GCC CTC 
GAT GAC GGG CGG TAG GAG AGC GGG CCT CGG GAG 



Val Val Gly Val Val Cys Ala Ala Ile Leu Arg

5677 GTA GTC GGC GTG GTC TGT GCA GCA ATA CTG CGC 
CAT CAG CCG CAC CAG ACA CGT CGT TAT GAC GCG 



Arg His Val Gly Pro Gly Glu Gly Ala Val Gln

5710 CGG CAC GTT GGC CCG GGC GAG GGG GCA GTG CAG 
GCC GTG CAA CCG GGC CCG CTC CCC CGT CAC GTC 
Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg

5743 TGG ATG AAC CGG CTG ATA GCC TTC GCC TCC CGG
ACC TAC TTG GCC GAC TAT CGG AAG CGG AGG GCC 



Gly Asn His Val Ser Pro Thr His Tyr Val Pro 
5776 GGG AAC CAT GTT TCC CCC ACG CAC TAC GTG CCG 
CCC TTG GTA CAA AGG GGG TGC GTG ATG CAC GGC 



Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile

5809 GAG AGC GAT GCA GCT GCC CGC GTC ACT GCC ATA 
CTC TCG CTA CGT CGA CGG GCG CAG TGA CGG TAT 



Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg

5842 CTC AGC AGC CTC ACT GTA ACC CAG CTC CTG AGG 
GAG TCG TCG GAG TGA CAT TGG GTC GAG GAC TCC 



Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr
5875 CGA CTG CAC CAG TGG ATA AGC TCG GAG TGT ACC 
GCT GAC GTG GTC ACC TAT TCG AGC CTC ACA TGG 



Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile

5908 ACT CCA TGC TCC GGT TCC TGG CTA AGG GAC ATC 
TGA GGT ACG AGG CCA AGG ACC GAT TCC CTG TAG 



Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe
5941 TGG GAC TGG ATA TGC GAG GTG TTG AGC GAC TTT 
ACC CTG ACC TAT ACG CTC CAC AAC TCG CTG AAA 



Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln
5974 AAG ACC TGG CTA AAA GCT AAG CTC ATG CCA CAG 
TTC TGG ACC GAT TTT CGA TTC GAG TAC GGT GTC 



Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg
6007 CTG CCT GGG ATC CCC TTT GTG TCC TGC CAG CGC 
GAC GGA CCC TAG GGG AAA CAC AGG ACG GTC GCG 
Gly Tyr Lys Gly Val Trp Arg Val Asp Gly Ile

6040 GGG TAT AAG GGG GTC TGG CGA GTG GAC GGC ATC
CCC ATA TTC CCC CAG ACC GCT CAC CTG CCG TAG



Met His Thr Arg Cys His Cys Gly Ala Glu Ile

6073 ATG CAC ACT CGC TGC CAC TGT GGA GCT GAG ATC
TAC GTG TGA GCG ACG GTG ACA CCT CGA CTC TAG



Thr Gly His Val Lys Asn Gly Thr Met Arg Ile

6106 ACT GGA CAT GTC AAA AAC GGG ACG ATG AGG ATC
TGA CCT GTA CAG TTT TTG CCC TGC TAC TCC TAG



Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser

6139 GTC GGT CCT AGG ACC TGC AGG AAC ATG TGG AGT
CAG CCA GGA TCC TGG ACG TCC TTG TAC ACC TCA



Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly

6172 GGG ACC TTC CCC ATT AAT GCC TAC ACC ACG GGC
CCC TGG AAG GGG TAA TTA CGG ATG TGG TGC CCG



Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr

6205 CCC TGT ACC CCC CTT CCT GCG CCG AAC TAC ACG
GGG ACA TGG GGG GAA GGA CGC GGC TTG ATG TGC



Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr

6238 TTC GCG CTA TGG AGG GTG TCT GCA GAG GAA TAT
AAG CGC GAT ACC TCC CAC AGA CGT CTC CTT ATA



Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr

6271 GTG GAG ATA AGG CAG GTG GGG GAC TTC CAC TAC
CAC CTC TAT TCC GTC CAC CCC CTG AAG GTG ATG



Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys

6304 GTG ACG GGT ATG ACT ACT GAC AAT CTC AAA TGC
CAC TGC CCA TAC TGA TGA CTG TTA GAG TTT ACG

FIG. 12-22 
Pro Cys Gln Val Pro Ser Pro Glu Phe Phe Thr

6337 CCG TGC CAG GTC CCA TCG CCC GAA TTT TTC ACA
GGC ACG GTC CAG GGT AGC GGG CTT AAA AAG TGT



Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala

6370 GAA TTG GAC GGG GTG CGC CTA CAT AGG TTT GCG
CTT AAC CTG CCC CAC GCG GAT GTA TCC AAA CGC



Pro Pro Cys Lys

6403 CCC CCC TGC AAG
GGG GGG ACG TTC



Leu Leu Arg Glu Glu Val

TTG CTG CGG GAG GAG GTA
AAC GAC GCC CTC CTC CAT



Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val

6436 TCA TTC AGA GTA GGA CTC CAC GAA TAC CCG GTA
AGT AAG TCT CAT CCT GAG GTG CTT ATG GGC CAT



Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp

6469 GGG TCG CAA TTA CCT TGC GAG CCC GAA CCG GAC
CCC AGC GTT AAT GGA ACG CTC GGG CTT GGC CTG



Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro

6502 GTG GCC GTG TTG ACG TCC ATG CTC ACT GAT CCC
CAC CGG CAC AAC TGC AGG TAC GAG TGA CTA GGG



Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg

6535 TCC CAT ATA ACA GCA GAG GCG GCC GGG CGA AGG
AGG GTA TAT TGT CGT CTC CGC CGG CCC GCT TCC



Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser

6568 TTG GCG AGG GGA TCA CCC CCC TCT GTG GCC AGC
AAC CGC TCC CCT AGT GGG GGG AGA CAC CGG TCG



Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu

6601 TCC TCG GCT AGC CAG CTA TCC GCT CCA TCT CTC
AGG AGC CGA TCG GTC GAT AGG CGA GGT AGA GAG

FIG. 12-23
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro

6634 AAG GCA ACT TGC ACC GCT AAC CAT GAC TCC CCT
TTC CGT TGA ACG TGG CGA TTG GTA CTG AGG GGA



Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp
6667 GAT GCT GAG CTC ATA GAG GCC AAC CTC CTA TGG 
CTA CGA CTC GAG TAT CTC CGG TTG GAG GAT ACC 



Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val

6700 AGG CAG GAG ATG GGC GGC AAC ATC ACC AGG GTT 
TCC GTC CTC TAC CCG CCG TTG TAG TGG TCC CAA 



Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser

6733 GAG TCA GAA AAC AAA GTG GTG ATT CTG GAC TCC 
CTC AGT CTT TTG TTT CAC CAC TAA GAC CTG AGG 



Phe Asp Pro Leu Val Ala Glu Glu Asp Glu Arg

6766 TTC GAT CCG CTT GTG GCG GAG GAG GAC GAG CGG 
AAG CTA GGC GAA CAC CGC CTC CTC CTG CTC GCC 



Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys

6799 GAG ATC TCC GTA CCC GCA GAA ATC CTG CGG AAG 
CTC TAG AGG CAT GGG CGT CTT TAG GAC GCC TTC 



Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp

6832 TCT CGG AGA TTC GCC CAG GCC CTG CCC GTT TGG 
AGA GCC TCT AAG CGG GTC CGG GAC GGG CAA ACC 



Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu 
6865 GCG CGG CCG GAC TAT AAC CCC CCG CTA GTG GAG 
CGC GCC GGC CTG ATA TTG GGG GGC GAT CAC CTC 



Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val 

6898 ACG TGG AAA AAG CCC GAC TAC GAA CCA CCT GTG 
TGC ACC TTT TTC GGG CTG ATG CTT GGT GGA CAC 

FIG. 12-24 
Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser

6931 GTC CAT GGC TGT CCG CTT CCA CCT CCA AAG TCC
CAG GTA CCG ACA GGC GAA GGT GGA GGT TTC AGG



Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr 
6964 CCT CCT GTG CCT CCG CCT CGG AAG AAG CGG ACG 
GGA GGA CAC GGA GGC GGA GCC TTC TTC GCC TGC 



Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala 
6997 GTG GTC CTC ACT GAA TCA ACC CTA TCT ACT GCC 
CAC CAG GAG TGA CTT AGT TGG GAT AGA TGA CGG 



Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser 

7030 TTG GCC GAG CTC GCC ACC AGA AGC TTT GGC AGC 
AAC CGG CTC GAG CGG TGG TCT TCG AAA CCG TCG 



Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr

7063 TCC TCA ACT TCC GGC ATT ACG GGC GAC AAT ACG 
AGG AGT TGA AGG CCG TAA TGC CCG CTG TTA TGC 



Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys

7096 ACA ACA TCC TCT GAG CCC GCC CCT TCT GGC TGC 
TGT TGT AGG AGA CTC GGG CGG GGA AGA CCG ACG 



Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser 

7129 CCC CCC GAC TCC GAC GCT GAG TCC TAT TCC TCC 
GGG GGG CTG AGG CTG CGA CTC AGG ATA AGG AGG 



Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro 

7162 ATG CCC CCC CTG GAG GGG GAG CCT GGG GAT CCG 
TAC GGG GGG GAC CTC CCC CTC GGA CCC CTA GGC 



Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser 

7195 GAT CTT AGC GAC GGG TCA TGG TCA ACG GTC AGT 
CTA GAA TCG CTG CCC AGT ACC AGT TGC CAG TCA 
Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys

7228 AGT GAG GCC AAC GCG GAG GAT GTC GTG TGC TGC 
TCA CTC CGG TTG CGC CTC CTA CAG CAC ACG ACG 



Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val 
7261 TCA ATG TCT TAC TCT TGG ACA GGC GCA CTC GTC
AGT TAC AGA ATG AGA ACC TGT CCG CGT GAG CAG 



Thr Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro
7294 ACC CCG TGC GCC GCG GAA GAA CAG AAA CTG CCC 
TGG GGC ACG CGG CGC CTT CTT GTC TTT GAC GGG 



Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His

7327 ATC AAT GCA CTA AGC AAC TCG TTG CTA CGT CAC 
TAG TTA CGT GAT TCG TTG AGC AAC GAT GCA GTG 



His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser

7360 CAC AAT TTG GTG TAT TCC ACC ACC TCA CGC AGT 
GTG TTA AAC CAC ATA AGG TGG TGG AGT GCG TCA 



Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp

7393 GCT TGC CAA AGG CAG AAG AAA GTC ACA TTT GAC 
CGA ACG GTT TCC GTC TTC TTT CAG TGT AAA CTG 



Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp

7426 AGA CTG CAA GTT CTG GAC AGC CAT TAC CAG GAC 
TCT GAC GTT CAA GAC CTG TCG GTA ATG GTC CTG 



Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys 
7459 GTA CTC AAG GAG GTT AAA GCA GCG GCG TCA AAA 
CAT GAG TTC CTC CAA TTT CGT CGC CGC AGT TTT 



Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala 

7492 GTG AAG GCT AAC TTG CTA TCC GTA GAG GAA GCT 
CAC TTC CGA TTG AAC GAT AGG CAT CTC CTT CGA 

FIG. 12-26 
Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser

7525 TGC AGC CTG ACG CCC CCA CAC TCA GCC AAA TCC
ACG TCG GAC TGC GGG GGT GTG AGT CGG TTT AGG



Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys

7558 AAG TTT GGT TAT GGG GCA AAA GAC GTC CGT TGC 
TTC AAA CCA ATA CCC CGT TTT CTG CAG GCA ACG 



His Ala Arg Lys Ala Val Thr His Ile Asn Ser

7591 CAT GCC AGA AAG GCC GTA ACC CAC ATC AAC TCC 
GTA CGG TCT TTC CGG CAT TGG GTG TAG TTG AGG 



Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr 
7624 GTG TGG AAA GAC CTT CTG GAA GAC AAT GTA ACA 
CAC ACC TTT CTG GAA GAC CTT CTG TTA CAT TGT 



Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu

7657 CCA ATA GAC ACT ACC ATC ATG GCT AAG AAC GAG 
GGT TAT CTG TGA TGG TAG TAC CGA TTC TTG CTC 



Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg

7690 GTT TTC TGC GTT CAG CCT GAG AAG GGG GGT CGT 
CAA AAG ACG CAA GTC GGA CTC TTC CCC CCA GCA 



Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu

7723 AAG CCA GCT CGT CTC ATC GTG TTC CCC GAT CTG 
TTC GGT CGA GCA GAG TAG CAC AAG GGG CTA GAC 



Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr

7756 GGC GTG CGC GTG TGC GAA AAG ATG GCT TTG TAC 
CCG CAC GCG CAC ACG CTT TTC TAC CGA AAC ATG 



Asp Val Val Thr Lys Leu Pro Leu Ala Val Met 

7789 GAC GTG GTT ACA AAG CTC CCC TTG GCC GTG ATG 
CTG CAC CAA TGT TTC GAG GGG AAC CGG CAC TAC 

FIG. 12-27 
Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly

7822 GGA AGC TCC TAC GGA TTC CAA TAC TCA CCA GGA 
CCT TCG AGG ATG CCT AAG GTT ATG AGT GGT CCT 



Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys

7855 CAG CGG GTT GAA TTC CTC GTG CAA GCG TGG AAG 
GTC GCC CAA CTT AAG GAG CAC GTT CGC ACC TTC 



Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp 

7888 TCC AAG AAA ACC CCA ATG GGG TTC TCG TAT GAT 
AGG TTC TTT TGG GGT TAC CCC AAG AGC ATA CTA 



Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser 

7921 ACC CGC TGC TTT GAC TCC ACA GTC ACT GAG AGC 
TGG GCG ACG AAA CTG AGG TGT CAG TGA CTC TCG 



Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys

7954 GAC ATC CGT ACG GAG GAG GCA ATC TAC CAA TGT 
CTG TAG GCA TGC CTC CTC CGT TAG ATG GTT ACA 



Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile
7987 TGT GAC CTC GAC CCC CAA GCC CGC GTG GCC ATC 
ACA CTG GAG CTG GGG GTT CGG GCG CAC CGG TAG 



Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly 

8020 AAG TCC CTC ACC GAG AGG CTT TAT GTT GGG GGC 
TTC AGG GAG TGG CTC TCC GAA ATA CAA CCC CCG 



Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly 

8053 CCT CTT ACC AAT TCA AGG GGG GAG AAC TGC GGC 
GGA GAA TGG TTA AGT TCC CCC CTC TTG ACG CCG 



Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr 

8086 TAT CGC AGG TGC CGC GCG AGC GGC GTA CTG ACA 
ATA GCG TCC ACG GCG CGC TCG CCG CAT GAC TGT 

FIG. 12-28
Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile

8119 ACT AGC TGT GGT AAC ACC CTC ACT TGC TAC ATC 
TGA TCG ACA CCA TTG TGG GAG TGA ACG ATG TAG 



Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu 
8152 AAG GCC CGG GCA GCC TGT CGA GCC GCA GGG CTC 
TTC CGG GCC CGT CGG ACA GCT CGG CGT CCC GAG 



Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp

8185 CAG GAC TGC ACC ATG CTC GTG TGT GGC GAC GAC 
GTC CTG ACG TGG TAC GAG CAC ACA CCG CTG CTG 



Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln

8218 TTA GTC GTT ATC TGT GAA AGC GCG GGG GTC CAG 
AAT CAG GAA TAG ACA CTT TCG CGC CCC CAG GTC 



Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu 

8251 GAG GAC GCG GCG AGC CTG AGA GCC TTC ACG GAG 
CTC CTG CGC CGC TCG GAC TCT CGG AAG TGC CTC 



Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp 

8284 GCT ATG ACC AGG TAC TCC GCC CCC CCT GGG GAC 
CGA TAC TGG TCC ATG AGG CGG GGG GGA CCC CTG 



Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile

8317 CCC CCA CAA CCA GAA TAC GAC TTG GAG CTC ATA 
GGG GGT GTT GGT CTT ATG CTG AAC CTC GAG TAT 



Thr Ser Cys Ser Ser Asn Val Ser Val Ala His 
8350 ACA TCA TGC TCC TCC AAC GTG TCA GTC GCC CAC 
TGT AGT ACG AGG AGG TTG CAC AGT CAG CGG GTG 



Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr 

8383 GAC GGC GCT GGA AAG AGG GTC TAC TAC CTC ACC 
CTG CCG CGA CCT TTC TCC CAG ATG ATG GAG TGG 

FIG. 12-29 
Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala

8416 CGT GAC CCT ACA ACC CCC CTC GCG AGA GCT GCG
GCA CTG GGA TGT TGG GGG GAG CGC TCT CGA CGC 



Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser 
8449 TGG GAG ACA GCA AGA CAC ACT CCA GTC AAT TCC 
ACC CTC TGT CGT TCT GTG TGA GGT CAG TTA AGG 



Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr

8482 TGG CTA GGC AAC ATA ATC ATG TTT GCC CCC ACA 
ACC GAT CCG TTG TAT TAG TAC AAA CGG GGG TGT 



Leu Trp Ala Arg Met Ile Leu Met Thr His Phe
8515 CTG TGG GCG AGG ATG ATA CTG ATG ACC CAT TTC 
GAC ACC CGC TCC TAC TAT GAC TAC TGG GTA AAG 



Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu

8548 TTT AGC GTC CTT ATA GCC AGG GAC CAG CTT GAA 
AAA TCG CAG GAA TAT CGG TCC CTG GTC GAA CTT 



Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys

8581 CAG GCC CTC GAT TGC GAG ATC TAC GGG GCC TGC 
GTC CGG GAG CTA ACG CTC TAG ATG CCC CGG ACG 



Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile

8614 TAC TCC ATA GAA CCA CTT GAT CTA CCT CCA ATC 
ATG AGG TAT CTT GGT GAA CTA GAT GGA GGT TAG 



Ile Gln Arg Leu His Gly Leu Ser Ala Phe Ser
8647 ATT CAA AGA CTC CAT GGC CTC AGC GCA TTT TCA 
TAA GTT TCT GAG GTA CCG GAG TCG CGT AAA AGT 



Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg

8680 CTC CAC AGT TAC TCT CCA GGT GAA ATT AAT AGG 
GAG GTG TCA ATG AGA GGT CCA CTT TAA TTA TCC 

FIG. 12-30 
Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro

8713 GTG GCC GCA TGC CTC AGA AAA CTT GGG GTA CCG
CAC CGG CGT ACG GAG TCT TTT GAA CCC CAT GGC 



Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser 

8746 CCC TTG CGA GCT TGG AGA CAC CGG GCC CGG AGC 
GGG AAC GCT CGA ACC TCT GTG GCC CGG GCC TCG 



Val Arg Ala Arg Leu Leu Ala Arg Gly Gly Arg 

8779 GTC CGC GCT AGG CTT CTG GCC AGA GGA GGC AGG 
CAG GCG CGA TCC GAA GAC CGG TCT CCT CCG TCC 



Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp

8812 GCT GCC ATA TGT GGC AAG TAC CTC TTC AAC TGG 
CGA CGG TAT ACA CCG TTC ATG GAG AAG TTG ACC 



Ala Val Arg Thr Lys Leu Lys 

8845 GCA GTA AGA ACA AAG CTC AAA C 
CGT CAT TCT TGT TTC GAG TTT G 

FIG. 12-31 
primer J159S




Ser Leu Lys Thr 

1 C TCC CTC AAA ACT 
T AG C C 



Gly Phe Leu Ala Ala 

GGG TTT CTT GCC GCG 
C GG T G AG 
Trp Gly 



Leu Phe Tyr Thr His Lys Phe Asn Ala 

29 CTG TTC TAC ACA CAC AAG TTC AAC GCG 
T T CAC T T 

His. Ser 

primer 166A for Jl-1216 



Ser Gly Cys Pro Glu Arg Met Ala Ser 

56 TCC GGA TGC CCG GAG CGC ATG GCC AGC 
A C T T A G C A 

Leu 



Cys Arg Ser Ile Asp Lys Phe Asp Gln

83 TGT CGC TCC ATT GAC AAG TTC GAC CAG 
C AC C AC G T T 
Pro Leu Thr Asp 



Gly Trp Gly Pro 

110 GGA TGG GGT CCC 
C C T 



Ile Thr Tyr Ala Gln

ATC ACC TAT GCT CAA 
GT CAC 

Ser Asn 



Pro Asp Asn Ser Asp Gln Arg Pro Tyr

137 CCT GAC AAC TCG GAC CAG AGG CCG TAT 
GGA AG GG C C CCCC 

Gly Ser Gly Pro 



Cys Trp His Tyr Ala Pro Arg Gln Cys
164 TGC TGG CAC TAC GCA CCT CGA CAG TGT 

C C A AA CT C 
Pro Lys Pro

FIG. 13-1
Gly Ile Val Pro Ala Ser Gln Val Cys

Jl 191 GGT ATC GTA CCC GCG TCG GAG GTG TGC 

PT T G AA AGT T 

Gly Pro Val Tyr Cys Phe Thr Pro Ser 

Jl 218 GGT CCA GTG TAT TGC TTC ACC CCA AGC 

PT G A T C 



Pro Val Val Val Gly Thr Thr Asp Arg 
Jl 245 CCT GTT GTA GTG GGG ACG ACC GAT CGT 

PT CGG A CAG 



Phe Gly Ala Pro Thr Tyr Asn Trp Gly 

Jl 272 TTC GGC GCC CCT ACG TAT AAC TGG GGG 

PT CG GCCCG T 

SSS. SSSL 



Asp Asn Glu Thr Asp Val Leu Leu Leu 
Jl 299 GAC AAT GAG ACG GAC GTG CTG CTC CTA 

PT A T CTCGT 

Glu Asp Phe Val 



Asn Asn Thr Arg Pro Pro His Gly Asn 

Jl 326 AAC AAC ACG CGG CCC CCG CAC GGC AAC 

PT T . C A A TG T 

Leu



Trp Phe Gly Cys Thr 

Jl 353 TGG TTC GGC TGT ACA 

PT T CTGGATGAACTCAACTGGATT

primer 199A



Nucleotide Match: 259/367 (70.6%)

Amino Acid Match (stringent): 93/122 (76.2%) 

(relaxed): 111/122 (91.0%) 

FIG. 13-2 






EP 0 939 128 A2 



Prototype HCV (PT) sequences different from 
Japanese HCV (Jl) are shown. 

Relaxed amino acid match: Gly=Ala=Pro=Ser=Thr,
Asp=Glu, Asn=Gln,

Arg=Lys=His, Leu=Ile=Val=Met, Phe=Trp=Tyr.
Underline: amino acid that differs even under relaxed
matching.

FIG. 13-3 
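
The stringent and relaxed amino acid match figures reported with FIG. 13-2 can be illustrated with a short sketch. It assumes the legend's equivalence groups are read as Gly=Ala=Pro=Ser=Thr, Asp=Glu, Asn=Gln, Arg=Lys=His, Leu=Ile=Val=Met and Phe=Trp=Tyr (the printed legend is partly garbled, so this reading is an interpretation). The function names `relaxed_equal`, `match_counts` and `percent` are hypothetical helpers introduced here for illustration, and the two short peptides in the usage note are toy inputs, not sequences taken from the figures.

```python
from typing import Sequence, Tuple

# Equivalence groups for "relaxed" matching, as read from the FIG. 13-3 legend.
# Assumption: residues are given as three-letter codes (e.g. "Gly", "Ile").
RELAXED_GROUPS = [
    {"Gly", "Ala", "Pro", "Ser", "Thr"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Arg", "Lys", "His"},
    {"Leu", "Ile", "Val", "Met"},
    {"Phe", "Trp", "Tyr"},
]


def relaxed_equal(a: str, b: str) -> bool:
    """True if two residues are identical or fall in the same relaxed group."""
    if a == b:
        return True
    return any(a in group and b in group for group in RELAXED_GROUPS)


def match_counts(seq_a: Sequence[str], seq_b: Sequence[str]) -> Tuple[int, int, int]:
    """Return (stringent matches, relaxed matches, positions compared).

    Both sequences must already be aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = list(zip(seq_a, seq_b))
    stringent = sum(1 for a, b in pairs if a == b)
    relaxed = sum(1 for a, b in pairs if relaxed_equal(a, b))
    return stringent, relaxed, len(pairs)


def percent(matches: int, total: int) -> str:
    """Format a match ratio the way the figure reports it, e.g. '76.2%'."""
    return f"{100 * matches / total:.1f}%"
```

For example, with the toy peptides `["Gly", "Leu", "Asp", "Cys", "Phe"]` and `["Ala", "Leu", "Glu", "Cys", "Lys"]`, `match_counts` gives 2 stringent and 4 relaxed matches out of 5 positions. Applying `percent` to the counts printed with FIG. 13-2 reproduces the reported values: 259/367 gives 70.6%, 93/122 gives 76.2%, and 111/122 gives 91.0%.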
Core to NS1 vs. HCV-1
jl Pro Leu Val 

HCV-1 T CCG CTC GTC 

A 



Gly Ala Pro Leu Gly Gly Ala Ala Arg 

Jl 11 GGC GCC CCC TTA GGG GGC GCT GCC AGG 



Ala Leu Ala His Gly Val Arg Val Leu 

Jl 38 GCC CTG GCA CAT GGT GTC CGG GTT CTG 

G C 



Glu Asp Gly Val Asn Tyr Ala Thr Gly 
Jl 65 GAG GAC GGC GTG AAC TAT GCA ACA GGG 

— A " 







Asn Leu Pro Gly Cys Ser Phe Ser Ile

Jl 92 AAT TTG CCC GGT TGC TCT TTC TCT ATC

--C C-T --T



Phe Leu Leu Ala Leu Leu Ser Cys Leu

Jl 119 TTC CTC TTG GCT CTG CTG TCC TGT TTG



Thr Ile Pro Ala Ser Ala Tyr Glu Val

Jl 146 ACC ATC CCA GCT TCC GCT TAT GAA GTG

--T G-G --C --G --C --C C--

Val Gln





Arg Asn Val Ser Gly Ile Tyr His Val

Jl 173 CGC AAC GTG TCC GGG ATA TAC CAT GTC 

TCC A-G C-T 

Ser Thr Leu



Thr Asn Asp Cys Ser Asn Ser Ser Ile

Jl 200 ACA AAC GAC TGC TCC AAC TCA AGC ATT 

q ~ t C-T ~G — T 



FIG. 14-1 
Val Tyr Glu Ala Ala Asp Val Ile Met

Jl 227 GTG TAT GAG GCG GCG GAC GTG ATC ATG 

— - — c - — — C — T -CC — C~ 

Ala Leu 



His Ala Pro Gly Cys Val Pro Cys Val 

Jl 254 CAT GCC CCC GGG TGC GTG CCC TGC GTT 

— C A-T — G --- — — C ~T --- 

Thr 



Arg Glu Asn Asn Ser Ser Arg Cys Trp

Jl 281 CGG GAG AAC AAT TCC TCC CGT TGC TGG 

— T GG- ~C G— ~G A-G — T 

Gly Ala 



Val Ala Leu Thr Pro Thr Leu Ala Ala 

Jl 308 GTA GCG CTC ACT CCC ACG CTC GCG GCC 

— A A-G — C ~T G-G — C A — 

Met Val Thr 



Arg Asn Ala Ser Val Pro Thr Thr Thr 

Jl 335 AGG AAT GCC AGC GTC CCC ACT ACG ACA 

G G AA C G-G CAG 

Asp Gly Lys Leu Ala 

Gln



Leu Arg Arg His Val Asp Leu Leu Val 

Jl 362 TTA CGA CGC CAC GTC GAC TTG CTC GTT 

C-T T A T C T — C 

Ile



Gly Thr Ala Ala Phe Cys Ser Ala Met

Jl 389 GGG ACG GCT GCT TTC TGC TCC GCT ATG 

GC — C A-C C T — G — C C-C 

Ser Thr Leu Leu 



Tyr Val Gly Asp Leu Cys Gly Ser Val

Jl 416 TAC GTG GGG GAT CTC TGC GGA TCT GTT
— C — A -G C 



FIG. 14-2
Phe Leu Ile Ser Gln Leu Phe Thr Phe
443 TTC CTC ATC TCC CAG CTG TTC ACC TTC
--T --T G-- GG- --A

Val Gly



Ser Pro Arg Arg His Glu Thr Val Gln

470 TCG CCT CGC CGG CAT GAG ACA GTA CAG 
— T — C A-G — C — C TG- — G ACG — A 

Trp Thr 



Asp Cys Asn Cys Ser Ile Tyr Pro Gly

497 GAC TGC AAC TGC TCA ATC TAT CCC GGC 
-<5T» — — T — T 

Gly 



His Val Ser Gly His Arg Met Ala Trp

524 CAC GTA TCA GGC CAT CGC ATG GCT TGG 

— T A — A-G — T — C A 

Ile Thr



Asp Met Met Met Asn Trp Ser Pro Thr 

551 GAT ATG ATG ATG AAC TGG TCG CCC ACG 
-c — T 



Ala Ala Leu Val Val Ser Gln Leu Leu

578 GCA GCC TTA GTG GTG TCG CAG TTA CTC 
A-G — G — G — A A — G-T C-G 

Thr Met Ala 



Arg Ile Pro Gln Ala Val Met Asp Met

605 CGG ATC CCA CAA GCT GTC ATG GAC ATG 
— c A— T 

Ile Leu



Val Ala Gly Ala His Trp Gly Val Leu 

632 GTG GCG GGG GCC CAC TGG GGA GTC CTA 

A-C — T — T ~T -G 

Ile

FIG. 14-3 
Ala Gly Leu Ala Tyr Tyr Ser Met Val

Jl 659 GCG GGC CTT GCC TAC TAT TCC ATG GTG

A-A --G --T

Ile Phe



Gly Asn Trp Ala Lys Val Leu Ile Val

Jl 686 GGG AAC TGG GCT AAG GTT TTG ATT GTG

--G --C C-- G-A

Val



Met Leu Leu Phe Ala Gly Val Asp Gly

Jl 713 ATG CTA CTC TTT GCC GGC GTT GAC GGG

C-- --G --A C C-

Leu Ala



His Thr Arg Val Thr Gly Gly Val Gln

Jl 740 CAT ACC CGC GTG ACG GGG GGG GTG CAA 

G-A A C — C A AGT GCC 

Glu His Ser Ala



Gly His Val Thr Ser Thr Leu Thr Ser 
Jl 767 GGC CAC GTC ACC TCT ACA CTC ACG TCC 

ACT GTG — - GGA T-T GTT AG- 

Thr Val Gly Phe Val 



Leu Phe Arg Pro Gly Ala Ser Gln Lys

Jl 794 CTC TTT AGA CCT GGG GCG TCC CAG AAA 

C-C GC A — C — C AAG C~ — C 

Leu Ala Lys Asn 



Ile Gln Leu Val Asn Thr Asn Gly Ser

Jl 821 ATT CAG CTT GTA AAC ACC AAT GGC AGT 

G-C -G A-C C 

Val Ile



Trp His Ile Asn Arg Thr Ala Leu Asn
Jl 848 TGG CAT ATC AAC AGG ACT GCC CTG AAC 

-C C T — C — G 

Ser 

FIG. 14-4 
Cys Asn Asp Ser Leu Gln Thr Gly Phe

875 TGC AAT GAC TCC CTC CAA ACT GGG TTC 

— T AG- --- A-C — C — C -GG 

Asn Trp 

Leu Ala Ala Leu Phe Tyr Thr His Lys
902 CTT GCC GCG CTG TTC TAC ACA CAC AAG 

T-G — A -G- — T — T CAC C— 

Gly His 



Phe Asn Ala Ser Gly Cys Pro Glu Arg

929 TTC AAC GCG TCC GGA TGC CCG GAG CGC
T-T — A — C — T — T A-G 



Met Ala Ser Cys Arg Ser Ile Asp Lys

956 ATG GCC AGC TGT CGC TCC ATT GAC AAG
C-A C — A C — C — AC- G-T

Leu Pro Leu Thr Asp



Phe Asp Gln Gly Trp Gly Pro Ile Thr
983 TTC GAC CAG GGA TGG GGT CCC ATC ACC 
— T — — c — — C — T — 



Tyr Ala Gln Pro Asp Asn Ser Asp Gln

1010 TAT GCT CAA CCT GAC AAC TCG GAC CAG 
C AAC GGA AGC GG- C-C 

Asn Gly Ser Gly Pro 



Arg Pro Tyr Cys Trp His Tyr Ala Pro 

1037 AGG CCG TAT TGC TGG CAC TAC GCA CCT 
C-C — C — C C-C —A 



Arg Gln Cys Gly Ile Val Pro Ala Ser

1064 CGA CAG TGT GGT ATC GTA CCC GCG TCG 
AA- -CT — C — T — G AA- 

Lys Pro Lys 

FIG. 14-5 
Gln Val Cys Gly Pro Val Tyr Cys Phe
1091 CAG GTG TGC GGT CCA GTG TAT TGC TTC 
AGT — T — G — A 



Thr Pro Ser Pro Val Val Val Gly Thr 

1118 ACC CCA AGC CCT GTT GTA GTG GGG ACG 
— T _. c c Q q A 



Thr Asp Arg Phe Gly Ala Pro Thr Tyr 

1145 ACC GAT CGT TTC GGC GCC CCT ACG TAT 
A-G -CG - — — G — C — C — C 



Asn Trp Gly Asp Asn Glu Thr Asp Val 

1172 AAC TGG GGG GAC AAT GAG ACG GAC GTG 

-G — — T —A ~T — >c 

Ser Glu Asp 



Leu Leu Leu Asn Asn Thr Arg Pro Pro 
1199 CTG CTC CTA AAC AAC ACG CGG CCC CCG 

T-C G T T — C A A 

Phe Val 



His Gly Asn Trp Phe Gly Cys Thr 

1226 CAC GGC AAC TGG TTC GGC TGT ACA 

-TG — T — T — 

Leu 

FIG. 14-6
Gly Asn Trp Phe Gly Cys Thr Trp Met

TG GGC AAC TGG TTC GGC TGT ACA TGG ATG 
— — T — T — C 



Asn Ser Thr Gly Phe Thr Lys Thr Cys

AAT AGC ACT GGG TTC ACC AAG ACG TGC 

— C TCA A A GT 

Val 



Gly Gly Pro Pro Cys Asn Ile Gly Gly
GGA GGC CCC CCG TGT AAC ATC GGG GGG 

— T — T GT- — A 

Val 



Val Gly Asn Asn Thr Leu Thr Cys Pro

GTC GGC AAC AAC ACC TTG ACC TGC CCC 

-CG C — CA 

Ala His 



Thr Asp Cys Phe Arg Lys Thr 

ACG GAC TGC TTC CGG AAG ACC 
— T — T C CAT - 

His 



Thr 
ACG 
- GAC 
Asp 



Ala Thr Tyr Thr Lys Cys Gly Ser Gly 

GCC ACT TAC ACA AAA TGT GGT TCG GGC 
A T-T CGG ~C — C — C — T 

Ser Arg 



Pro Trp Leu Thr Pro Arg Cys Leu Val 

CCT TGG TTG ACA CCT AGG TGC TTG GTT 

— C A-C C C C 

Ile



Asp Tyr Pro Tyr Arg Leu Trp His Tyr 

GAC TAC CCA TAC AGG CTC TGG CAC TAC 

FIG. 15-1 
Pro Cys Thr Val Asn Phe Thr Ile Phe

Jl 219 CCC TGC ACT GTC AAC TTT ACC ATC TTC 
HCV-1 — T — T ~C A — -AC — - —A — T 

Ile Tyr



Lys Val Arg Met Tyr Val Gly Gly Val

Jl 246 AAG GTT AGG ATG TAT GTG GGG GGC GTG
HCV-1 — A A-C — C A — G — C

Ile



Glu His
Jl 273 GAG CAC 
HCV-1 — A 

FIG. 15-2
C200 region sequence vs. HCV-1



Asn Met Ser 

C200 3799 AAT ATG TCC 

HCV-1 3781 ACA CTG GGC TTT GGT GCT T-C

Thr Leu Gly Phe Gly Ala Tyr 



Lys Ala His Gly Thr Asp Pro Asn Ile

C200 3808 AAG GCA CAT GGC ACC GAC CCC AAC ATC
HCV-1 T — G -T- — T — T — —



Arg Thr Gly Val Arg Thr Ile Thr Thr
C200 3835 AGA ACT GGG GTA AGG ACC ATC ACC ACA
HCV-1 — G — C — — G — A — A — T — T



Gly Ala Pro Ile Thr Tyr Ser Thr Tyr

C200 3862 GGT GCC CCC ATT ACG TAC TCC ACC TAT
HCV-1 — C AG C C

Ser 



Arg Lys Phe Leu Ala Asp Gly Gly Cys 
C200 3889 CGC AAG TTC CTT GCC GAC GGT GGT TGC 
HCV-1 G — c ~G

Gly 



Ser Gly Gly Ala Tyr Asp Ile Ile

C200 3916 TCC GGG GGC GCC TAT GAC ATC ATA A
HCV-1 — G — — — T - — — — A -TT

Ile



HCV-1 3943 TGT GAC GAG TGC CAC TCC ACG GAT GCC

Cys Asp Glu Cys His Ser Thr Asp Ala 



HCV-1 3970 ACA TCC ATC TTG GGC ATC GGC ACT GTC

Thr Ser Ile Leu Gly Ile Gly Thr Val

FIG. 16-1 
HCV-1 3997 CTT GAC CAA GCA GAG ACT GCG GGG GCG

Leu Asp Gln Ala Glu Thr Ala Gly Ala

HCV-1 4024 AGA CTG GTT GTG CTC GCC ACC GCC ACC 

Arg Leu Val Val Leu Ala Thr Ala Thr 

HCV-1 4051 CCT CCG GGC TCC GTC ACT GTG CCC CAT 

Pro Pro Gly Ser Val Thr Val Pro His 



HCV-1 4078 CCC AAC ATC GAG GAG GTT GCT CTG TCC 

Pro Asn Ile Glu Glu Val Ala Leu Ser



HCV-1 4105 ACC ACC GGA GAG ATC CCT TTT TAC GGC 

Thr Thr Gly Glu Ile Pro Phe Tyr Gly



Ser Ile Pro Ile Glu Ala Ile Lys

C200 4132 A AGC ATC CCC ATC GAG GCC ATC AAG 
HCV-1 AAG GCT C A -TA 

Lys Ala Val



Gly Gly Arg His Leu Ile Phe Cys His

C200 4159 GGG GGA AGG CAT CTC ATC TTC TGC CAT 
HCV-1 — -G — A — — — - — x 



Ser Lys Lys Lys Cys Asp Glu Leu Ala 

C200 4186 TCC AAG AAG AAG TGT GAC GAG CTC GCC 
HCV-1 —A c A 



Ala Lys Leu Ser Ala Leu Gly Leu Asn 
C200 4213 GCA AAG CTG TCA GCC CTC GGA CTC AAT 

HCV-1 GTC — A T-G — C A 

Val Ile



Ala Val Ala Tyr Tyr Arg Gly Leu Asp 

C200 4240 GCC GTG GCG TAT TAC CGC GGT CTT GAT 
HCV-1 c ~C C 

FIG. 16-2
Val Ser Val Ile Pro Thr Ser Gly Asp

C200 4267 GTG TCC GTC ATA CCA ACT AGC GGA GAC 
HCV-1 — c — G — C — C — T 



Val Val Val Val Ala Thr Asp 

C200 4294 GTC GTT GTC GTG GCA ACA GAC GC 4316 
HCV-1 — T — C ~c — T --C CTC 



HCV-1 4321 ATG ACC GGC TAT ACC GGC GAC TTC GAC 

Met Thr Gly Tyr Thr Gly Asp Phe Asp 

HCV-1 4348 TCG GTG ATA GAC TGC AAT ACG TGT GTC 

Ser Val Ile Asp Cys Asn Thr Cys Val



HCV-1 4375 ACC CAG ACA GTC GAT TTC AGC CTT GAC 

Thr Gln Thr Val Asp Phe Ser Leu Asp



HCV-1 4402 CCT ACC TTC ACC ATT GAG ACA ATC ACG 

Pro Thr Phe Thr Ile Glu Thr Ile Thr



HCV-1 4429 CTC CCC CAG GAT GCT GTC TCC CGC ACT 

Leu Pro Gln Asp Ala Val Ser Arg Thr

HCV-1 4456 CAA CGT CGG GGC AGG ACT GGC AGG GGG 

Gln Arg Arg Gly Arg Thr Gly Arg Gly

HCV-1 4483 AAG CCA GGC ATC TAC AGA TTT GTG GCA 

Lys Pro Gly Ile Tyr Arg Phe Val Ala

HCV-1 4510 CCG GGG GAG CGC CCC TCC GGC ATG TTC 

Pro Gly Glu Arg Pro Ser Gly Met Phe 



HCV-1 4537 GAC TCG TCC GTC CTC TGT GAG TGC TAT 

Asp Ser Ser Val Leu Cys Glu Cys Tyr 

FIG. 16-3 
HCV-1 4564 GAC GCA GGC TGT GCT TGG TAT GAG CTC

Asp Ala Gly Cys Ala Trp Tyr Glu Leu 



HCV-1 4591 ACG CCC GCC GAG ACT ACA GTT AGG CTA 

Thr Pro Ala Glu Thr Thr Val Arg Leu 



HCV-1 4618 CGA GCG TAC ATG AAC ACC CCG GGG CTT 

Arg Ala Tyr Met Asn Thr Pro Gly Leu 



HCV-1 4645 CCC GTG TGC CAG GAC CAT CTT GAA TTT 

Pro Val Cys Gln Asp His Leu Glu Phe



HCV-1 4672 TGG GAG GGC GTC TTT ACA GGC CTC ACT 

Trp Glu Gly Val Phe Thr Gly Leu Thr 



HCV-1 4699 CAT ATA GAT GCC CAC TTT CTA TCC CAG 

His Ile Asp Ala His Phe Leu Ser Gln



HCV-1 4726 ACA AAG CAG AGT GGG GAG AAC CTT CCT 

Thr Lys Gln Ser Gly Glu Asn Leu Pro

HCV-1 4753 TAC CTG GTA GCG TAC CAA GCC ACC GTG 

Tyr Leu Val Ala Tyr Gln Ala Thr Val



HCV-1 4780 TGC GCT AGG GCT CAA GCC CCT CCC CCA 

Cys Ala Arg Ala Gln Ala Pro Pro Pro



HCV-1 4807 TCG TGG GAC CAG ATG TGG AAG TGT TTG 

Ser Trp Asp Gln Met Trp Lys Cys Leu



HCV-1 4834 ATT CGC CTC AAG CCC ACC CTC CAT GGG 

Ile Arg Leu Lys Pro Thr Leu His Gly



HCV-1 4861 CCA ACA CCC CTG CTA TAC AGA CTG GGC 

Pro Thr Pro Leu Leu Tyr Arg Leu Gly 

FIG. 16-4
HCV-1 4888 GCT GTT CAG AAT GAA ATC ACC CTG ACG

Ala Val Gln Asn Glu Ile Thr Leu Thr



HCV-1 4915 CAC CCA GTC ACC AAA TAC ATC ATG ACA

His Pro Val Thr Lys Tyr Ile Met Thr



HCV-1 4942 TGC ATG TCG GCC GAC CTG GAG GTC GTC

Cys Met Ser Ala Asp Leu Glu Val Val



HCV-1 4969 ACG AGC ACC TGG GTG CTC GTT GGC GGC

Thr Ser Thr Trp Val Leu Val Gly Gly



HCV-1 4996 GTC CTG GCT GCT TTG GCC GCG TAT TGC

Val Leu Ala Ala Leu Ala Ala Tyr Cys



HCV-1 5023 CTG TCA ACA GGC TGC GTG GTC ATA GTG

Leu Ser Thr Gly Cys Val Val Ile Val



HCV-1 5050 GGC AGG GTC GTC TTG TCC GGG AAG CCG

Gly Arg Val Val Leu Ser Gly Lys Pro



Glu Val Leu

C200 GAA GTC CTC
HCV-1 5077 GCA ATC ATA CCT GAC AGG

Ala Ile Ile Pro Asp Arg



Tyr Arg Glu Phe Asp Glu Met Glu Glu

C200 5104 TAC CGA GAG TTC GAT GAG ATG GAA GAG
HCV-1



Cys Ala Ser His Leu Pro Tyr Ile Glu

C200 5131 TGC GCC TCA CAC CTC CCC TAC ATC GAA
HCV-1 T-T CAG — T-A — G G

Ser Gln

FIG. 16-5
Gln Gly Met Gln Leu Ala Glu Gln Phe

C200 5158 CAG GGA ATG GAG CTC GCC GAG CAA TTC 
HCV-1 — A — G AT G 

Met 



Lys Gln Lys Ala Leu Gly Leu Leu Gln

C200 5185 AAG CAG AAG GCG CTC GGG TTG CTG CAA 
HCV-1 — - — — C — C C-C ~G 



Thr Ala Thr Lys Gln Ala Glu Ala Ala

C200 5212 ACA GCC ACC AAG CAA GCG GAG GCT GCT
HCV-1 — C — G T — CGT — G — A T- ATC 

Ser Arg Val Ile



Ala Pro Cys Glu Ser Met His Ala Ser

C200 5239 GCT CCG TGT GAG TCA ATG CAC GCC TCG
HCV-1 --C --T GC- -TC CAG -CC A-- TGG CAA

Ala Val Gln Thr Asn Trp Gln



C200 5266 A 

HCV-1 -AA CTC GAG ACC TTC TGG GCG AAG CAT 

Lys Leu Glu Thr Phe Trp Ala Lys His 



HCV-1 5293 ATG TGG AAC TTC ATC AGT GGG ATA CAA TA 

Met Trp Asn Phe Ile Ser Gly Ile Gln

FIG. 16-6 
NS1 Sequence vs. HCV-1

Leu Gly Asn Trp Phe Gly Cys Thr Trp
J1 1 G TTG GGC AAT TGG TTC GGT TGC ACC TGG
HCV-1 - C T ...



Met Asn Ser Ser Gly Phe Thr Lys Val

J1 29 ATG AAC TCA TCT GGA TTT ACC AAA GTG
HCV-1

Thr

Cys Gly Ala Pro Pro Cys Val Ile Gly

Jl 56 TGC GGA GCG CCT CCT TGT GTC ATC GGA
HCV-1 — - ___

Ala 



Gly Val Gly Asn Asn Thr Leu Gln Cys

Jl 83 GGG GTG GGC AAC AAC ACC TTG CAA TGC

HCV-1 -. c , c— — C

Ala His 

Pro Thr Asp Cys Phe Arg Lys His Pro

Jl 110 CCC ACT GAC TGT TTC CGC AAG CAT CCG
HCV-1

Asp Ala Thr Tyr Ser Arg Cys Gly Ser

Jl 137 GAC GCC ACA TAC TCT CGG TGC GGT TCC 



C 



Gly Pro Trp Ile Thr Pro Arg Cys Leu

Jl 164 GGT CCC TGG ATT ACG CCC AGG TGC CTG
HCV-1 c __ A



Val His Tyr Pro Tyr Arg Leu Trp His 

Jl 191 GTC CAC TAC CCT TAT AGG CTT TGG CAT 
HCV-l 6 — — <s 

Asp 

Tyr Pro Cys Thr Val Asn Tyr Thr Leu 

J1 218 TAT CCC TGT ACT GTC AAC TAC ACC TTG
HCV-1 T C A A-A

Ile Ile



FIG. 17-1 
Phe Lys Val Arg Met Tyr Val Gly Gly

Jl 245 TTC AAA GTC AGG ATG TAC GTG GGA GGG
HCV-1 — T A —

Ile



Val Glu His Arg Leu Glu Val Ala Cys

Jl 272 GTC GAG CAC AGG CTG GAA GTT GCT TGC
HCV-1 — A C C

Ala



Asn Trp Thr Arg Gly Glu Arg Cys Asp

Jl 299 AAC TGG ACG CGG GGC GAG CGT TGT GAT
HCV-1 A C



Leu Asp Asp Arg Asp

Jl 326 CTG GAC GAC AGG GAC A
HCV-1 A

Glu

FIG. 17-2 
Core Sequence vs. HCV-1

H^-1 1 GCGTCTAGCCATGGCGTTAGTATGAGTGTC 

gjy-l 31 GTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCC 

HCV-1 66 ATAGTGGTCTGCGGAACCGGTGAGTACACCGGAAT 

101 TGCCAGGACGACCGGGTCCTTTCTTGGATCAACCC 

J1 136 GCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCGA

ncv— l — — — — _ 

— ~- A— 

HCV-1 ^ACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGC 
jj^ 206 CTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGC 



jjjy 241 CCCGGGAGGTCTCGTAGACCGTGCATCATG AGC 

Thr Asn Pro Lys Pro Gln Arg Lys Thr

ZL i 274 ACA CCT AAA CCT CAA AGA AAA ACC 

— 6 A A- 

Lys Asn 

Lys Arg Asn Thr Asn Arg Arg Pro Gln

HCV-1 ™ ff ACC AAC CGC CGC CCA CAG 

Asp Val Lys Phe Pro Gly Gly Gly Gln

Sir, , 328 GAC ^ AAG TTC CCG GGC GGT GGT CAG 
HCV-1 T _ c 

Ile Val Gly Gly Val Tyr Leu Leu Pro

Sir , 355 ATC GTT GGT GGA GTT TAC CTG TTG CCG 

"CV— 1 -— — — .... — ___ <"P— — «.«.«. 

* 

Arg Arg Gly Pro Arg Leu Gly Val Arg

HCV-1 f?f AGG GGC CCC AGG TTG GGT GTG CGC 



FIG. 18-1 
Ala Thr Arg Lys Thr Ser Glu Arg Ser

Jl 409 GCG ACT AGG AAG ACT TCC GAG CGG TCG
HCV-1 G — A — 

Gln Pro Arg Gly Arg Arg Gln Pro Ile

Jl 436 CAA CCT CGT GGA AGG CGA CAA CCT ATC 
HCV-1 — — A — T — A --T — G 

Pro Lys Ala Arg Gln Pro Glu Gly Arg

Jl 463 CCC AAG GCT CGC GAG CCC GAG GGC AGG 
HCV-1 — ~T -G- — — - „ 

Arg 

Ala Trp Ala Gln Pro Gly Tyr Pro Trp

Jl 490 GCC TGG GCT CAG CCC GGG TAC CCT TGG 
HCV-1 A 

Thr 

Pro Leu Tyr Gly Asn Glu Gly Met Gly 

Jl 517 CCC CTC TAT GGC AAC GAG GGC ATG GGG 

HCV-1 -— ... «- r ... ... «TGC — 

Cys 

Trp Ala Gly Trp Leu 
Jl 544 TGG GCA GGA TGG CTC CT 
HCV-1 ..6 - 

FIG. 18-2